EXAMPLE 5

High-level NPTII expression facilitates efficient 
recovery of transplastomic lines by selection for 
kanamycin resistance 

The plastid genome of higher plants is a 120-kb to
160-kb double-stranded DNA which is present in 1,900 to
50,000 copies per leaf cell (Bendich, 1987). To obtain 
genetically stable transplastomic lines every one of the 
plastid genome copies (ptDNA) should be uniformly 
altered in a plant. Since integration of foreign DNA 
always occurs by homologous recombination, plastid 
transformation vectors contain segments of the plastid 
genome to target insertions at specific locations. 
Useful, non-selectable genes are cloned next to the 
selectable marker genes, which are then introduced into 
the plastid genome by linkage to the selectable marker 
gene (Maliga, 1993). Transforming DNA is introduced into
plastids by the biolistic process (Svab et al., 1990;
Svab and Maliga, 1993) or PEG treatment (Golds et al.,
1993; O'Neill et al., 1993). Elimination of wild-type
genome copies occurs during repeated cell divisions on a 
selective medium. The success of transformation depends 
on the success of selective amplification of the few 
initially transformed genome copies. Therefore the
choice of the antibiotic used for the selective
amplification of transformed genome copies and the
mechanism by which the plant cells are protected from
antibiotic action are critical parameters to be
considered for successful generation of homoplasmic
plants.

The most commonly used antibiotic for the selection 
of transplastomic lines is spectinomycin, an inhibitor 
of protein synthesis on plastid ribosomes. Initially,
plastid transformation in tobacco was carried out by
selection for resistance based on mutations in the
plastid 16S rRNA (Svab et al., 1990). Selection was
inefficient, yielding about one transplastomic clone per
50 bombarded samples, probably because the
16S rRNA-based mutation is recessive. Recovery of transplastomic
lines was enhanced ~100-fold by selection for a dominant
marker, spectinomycin resistance based on inactivation
by aminoglycoside 3"-adenylyltransferase encoded in a
chimeric aadA gene (Svab and Maliga, 1993). In addition
to tobacco, selection for spectinomycin resistance 
(aadA) could be applied to recover transplastomic lines 
in Arabidopsis and potato. The aadA gene in plants 
confers resistance to both spectinomycin and 
streptomycin. Selection for streptomycin resistance was 
used for plastid transformation in rice, a species 
resistant to spectinomycin, after bombardment with a 
chimeric aadA gene. See Example 8. 

The need for an alternative marker gene for plastid
manipulation has led to testing kanamycin resistance as
a selective marker. A chimeric neo (kan) gene, encoding
neomycin phosphotransferase, was suitable to recover
transplastomic tobacco lines. However, recovery of
transplastomic lines was relatively inefficient,
yielding only one transplastomic line in ~25 bombarded
leaf samples. Furthermore, for every plastid
transformation event ~25 to 50 kanamycin resistant lines
were obtained in which integration of the plastid neo
construct into the nuclear genome resulted in kanamycin
resistance (Carrer et al., 1993). We report here that
the efficiency of recovering transplastomic clones is
significantly improved when transforming tobacco
chloroplasts with a new neo gene expressed from a
promoter with the atpB and clpP translation control
region. The number of nuclear transformation events is
reduced using the cassettes of the present invention.
These improvements make the new neo gene a practical
tool for plastid genome manipulations.

DISCUSSION 

The chimeric neo genes described in Examples 1-4
were introduced into plastids by selection for the
linked spectinomycin resistance (aadA) gene as their
suitability for directly selecting transplastomic lines
was unknown. The transplastomic lines listed in Table 3
were then tested for resistance to kanamycin by their
ability to proliferate on a medium containing 50 mg/L
kanamycin. The RMOP medium used for testing induces
formation of green callus and shoot regeneration in the
absence of kanamycin. The tissue culture procedures
utilized for this example are described in
Carrer et al., 1993 and Carrer and Maliga, 1995.

On the selective kanamycin medium only scanty, white
callus forms from wild-type leaf sections. Formation of
green callus and shoots from leaf sections of plants
transformed with pHK plasmids in Table 3 indicates that
accumulation of NPTII confers kanamycin resistance. We
set out to test if transplastomic clones can be directly
selected by kanamycin resistance after bombardment with
plasmids pHK30 and pHK32. The results are summarized in
Table 5.

Bombardment of 25 tobacco leaves with plasmid pHK30
yielded 45 kanamycin resistant lines on a medium
containing 50 mg/L kanamycin. Transplastomic neo lines
are expected to be resistant to much higher levels, 500
mg/L of kanamycin (Carrer et al., 1993). In addition, in
plasmid pHK30 the neo gene is physically linked to a
spectinomycin resistance (aadA) gene. Spectinomycin
resistance is manifested in the same way as kanamycin
resistance: sensitive leaf sections form white callus
and no shoots whereas resistant leaf sections form green
callus and shoots on selective (500 mg/L) RMOP medium.
We assumed, therefore, that all transplastomic lines
should be resistant to both 500 mg/L of kanamycin and
500 mg/L spectinomycin (Carrer and Maliga, 1995). When
applying this test we found that 22 of the 45 lines met
these criteria. Digestion of the plastid DNA with the
EcoRI restriction enzyme and probing with the plastid
targeting region should detect a 3.1-kb fragment in the
wild-type and a 4.2-kb and a 1.2-kb fragment in
transplastomic lines (Figure 15A). DNA gel blot analysis
of seven of the kanamycin-spectinomycin resistant lines
confirmed integration of both transgenes into the
plastid genome (Figure 15B). Therefore, we assume that
all 22 kanamycin-spectinomycin resistant lines are
transplastomic (Table 5).

Bombardment of 30 tobacco leaves with plasmid pHK32
yielded 28 kanamycin resistant lines on a medium
containing 50 mg/L kanamycin. We have identified 11
double-resistant lines by testing these on a medium
containing 500 mg/L of kanamycin and 500 mg/L
spectinomycin. All six tested were transplastomic by DNA
gel blot analysis. Therefore, we assume that all eleven
are transplastomic (Table 5).
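
The counts reported above for pHK30 and pHK32 can be turned into per-leaf recovery rates with a few lines of arithmetic. The sketch below is illustrative only: the class and field names are hypothetical, and it follows the text's assumption that every kanamycin-spectinomycin double-resistant line is transplastomic.

```python
from dataclasses import dataclass


@dataclass
class SelectionResult:
    """Counts taken from the text; field names are illustrative."""
    plasmid: str
    leaves_bombarded: int
    kan50_resistant: int   # lines recovered on 50 mg/L kanamycin
    double_resistant: int  # resistant to 500 mg/L kanamycin and 500 mg/L spectinomycin

    def transplastomic_fraction(self) -> float:
        # Assumes, as the text does, that double-resistant lines are transplastomic.
        return self.double_resistant / self.kan50_resistant

    def clones_per_leaf(self) -> float:
        return self.double_resistant / self.leaves_bombarded


results = [
    SelectionResult("pHK30", leaves_bombarded=25, kan50_resistant=45, double_resistant=22),
    SelectionResult("pHK32", leaves_bombarded=30, kan50_resistant=28, double_resistant=11),
]

for r in results:
    print(f"{r.plasmid}: {r.clones_per_leaf():.2f} transplastomic clones per leaf, "
          f"{r.transplastomic_fraction():.0%} of kanamycin resistant lines")
```

On these numbers both plasmids give well above the roughly one transplastomic line per 25 bombarded leaf samples cited for the earlier neo construct.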



TABLE 5

SELECTION OF TRANSPLASTOMIC TOBACCO
CLONES BY KANAMYCIN RESISTANCE

Columns: Plasmid; Kan. Res. 50 mg/L; Kan. Res. 500 mg/L;
Kan. Res. 500 mg/L and Spec. Res. 500 mg/L; Transplastomic

Rows: pTNH32 (a); pHK30; pHK32

(a Carrer et al., 1993)

DISCUSSION 

Plastid transformation efficiency should be
comparable if we target the same region of the plastid
genome for insertion, use similar size targeting
sequences and the same method of DNA delivery.
Therefore, the lower transformation efficiencies obtained
by selection for kanamycin resistance with the old
chimeric neo genes were likely due to the lack of recovery
of transplastomic clones by selection. We have found that
transformation with neo genes expressed from the
PrrnLatpB+DBwt and PrrnLclpP+DBwt promoters is as
efficient as with the aadA gene. This is a significant
technical advance, and will facilitate plastid
transformation in crops in which the regenerable
tissues contain non-green plastids. The most important
targets are the non-green plastids of cereal crops.
Kanamycin selection is widely used to obtain transgenic
lines after transformation with chimeric neo genes in
dicots. However, kanamycin is an undesirable selective
agent in monocots such as cereal tissue cultures.
NPTII also inactivates paromomycin, which may
be used to recover nuclear gene transformants at an
extremely high efficiency in cereals. See, for example,
PCT application WO99/05296.

EXAMPLE 6 

Bacterial bar gene expression in tobacco plastids 
confers resistance to the herbicide phosphinothricin 

Bialaphos, a non-selective herbicide, is a
tripeptide composed of two L-alanine residues and an
analog of glutamic acid known as phosphinothricin (PPT).
While PPT is an inhibitor of glutamine synthetase in
both plants and bacteria, the intact tripeptide has
little or no inhibitory effect in vitro. Bialaphos is
toxic for bacteria and plants, as intracellular
peptidases remove the alanine residues and release
active PPT. Bialaphos is produced by Streptomyces
hygroscopicus. The bacterium is protected from
phosphinothricin toxicity by phosphinothricin
acetyltransferase (PAT), the bar gene product. This
enzyme acetylates phosphinothricin or
demethylphosphinothricin (Thompson et al., 1987). PPT
resistant crops have been obtained by expressing the S.
hygroscopicus bar gene in the plant nucleus. Herbicide
resistant lines were obtained by direct selection for
PPT resistance in culture after Agrobacterium
tumefaciens-mediated DNA delivery in tobacco, potato,
Brassica napus and Brassica oleracea (De Block et al.,
1987, 1989). Biolistic DNA delivery of chimeric bar
genes has been employed to obtain PPT resistant maize
(Spencer et al., 1990), rice (Cao et al., 1992) and
Arabidopsis thaliana (Sawasaki et al., 1994).

Construction of transplastomic tobacco plants, in which
PPT resistance is based on the expression of bar from S.
hygroscopicus in plastids, is described in the present
example. The vectors utilized to express the bar gene
contain an exemplary chimeric 5' regulatory region as
set forth in the previous examples. The following
materials and methods facilitate the practice of this
aspect of the present invention.

Construction of plastid bar gene 

An NcoI/XbaI bar gene fragment was generated by PCR
amplification using plasmid pDM302 (Cao et al., 1992)
with the following primers:

P1, 5'-AAACCATGGCACCACAAACAGAGAGCCCAGAACGACGCCC-3';
P2, 5'-AAAATCTAGATCATCAGATCTCGGTGACG-3'.
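
As a quick sanity check on the primer design, the restriction sites each primer is meant to introduce can be located by plain string search. The following is a minimal sketch, not part of the original procedure; the function names are illustrative, and BglII is included only because its recognition sequence also occurs in P2.

```python
# Recognition sequences (all three are palindromic, so one strand suffices).
SITES = {"NcoI": "CCATGG", "XbaI": "TCTAGA", "BglII": "AGATCT"}

P1 = "AAACCATGGCACCACAAACAGAGAGCCCAGAACGACGCCC"
P2 = "AAAATCTAGATCATCAGATCTCGGTGACG"

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper- or lower-case DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_sites(seq: str) -> dict:
    """Map each enzyme whose site occurs in seq to the 0-based start index."""
    seq = seq.upper()
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


print("P1:", find_sites(P1))
print("P2:", find_sites(P2))
# Read on the coding strand, P2 supplies two consecutive TGA stop codons.
print("P2 reverse complement:", reverse_complement(P2))
```

P1 carries the NcoI site at the start codon, while P2 carries the intended XbaI site just downstream of the tandem stop codons.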

The ends of the PCR fragment were blunt ended by
treatment with the Klenow Fragment of DNA polymerase I.
The fragment was then ligated into the EcoRV site of
pBluescript II KS+ (Stratagene, La Jolla, CA) to create
plasmid pJEK3. Sequence analysis of pJEK3 plasmid DNA
revealed that the XbaI site we intended to create
through PCR amplification of pDM302 is absent. See
Figure 19. The bar gene has the two translation
termination codons followed by vector sequences. The
last 20 bp of pJEK3 are:

CCCGTCACCGAGATCTGATCAtcgaattcctgcagcccgggggatccactagttctaga.

The bar sequences are in capital (stop codons
underlined), the vector sequences are in lower case
(XbaI site underlined). Since there is an XbaI site
present in the vector 40 bp from the intended XbaI site,
it was not necessary to repair this error. The NcoI-XbaI
fragment from plasmid pJEK3 was ligated into NcoI-XbaI
digested pGS104 plasmid (Serino and Maliga, 1997) to
generate plasmid pJEK6. Plasmid pGS104 carries a
Prrn-TrbcL expression cassette in a pPRV111B plastid
transformation vector. A map of the plastid targeting
region of plasmid pJEK6 is shown in Figure 16A.



Plastid transformation and plant regeneration 

Tobacco (Nicotiana tabacum cv. Petit Havana)
plants were grown aseptically on agar-solidified medium
containing MS salts (Murashige and Skoog, 1962) and
sucrose (30 g/l). Leaves were placed abaxial side up on
RMOP media for bombardment. The RMOP medium consists of
MS salts, N6-benzyladenine (1 mg/l), 1-naphthaleneacetic
acid (0.1 mg/l), thiamine (1 mg/l), inositol (100 mg/l),
agar (6 g/l), pH 5.8, and sucrose (30 g/l). The DNA was
introduced into chloroplasts on the surface of 1 µm
tungsten particles using the DuPont PDS1000He Biolistic
gun (Maliga 1995). Spectinomycin resistant clones were
selected on RMOP medium containing 500 µg/ml
spectinomycin dihydrochloride. Resistant shoots were
regenerated on the same selective medium and rooted on
MS agar medium (Svab and Maliga, 1993). The
independently transformed lines are designated by the
transforming plasmid (pJEK6) and a serial number, for
example pJEK6-2, pJEK6-5. Plants regenerated from the
same transformed line are distinguished by letters, for
example pJEK6-2A, pJEK6-2B.



Southern Blot Analysis 

Total cellular DNA was isolated from wild-type and
transgenic spectinomycin resistant plants with CTAB
(Saghai-Maroof et al., 1984). The DNA was digested with
the SmaI and BglII restriction endonucleases, separated
on a 0.7% agarose gel and blotted onto a Hybond-N nylon
membrane (Amersham, Arlington Heights, IL) by a pressure
blotter. The membrane was hybridized overnight with an
ApaI/BamHI fragment labeled with [α-32P]dCTP using a
dCTP DNA Labeling Beads Kit (Pharmacia Inc, Piscataway,
NJ). The membrane was washed 2 times with 0.1X SSPE,
0.2X SDS at 55°C for 30 minutes. Film was exposed to the
membrane for 30 minutes at room temperature.

PAT Assay 

The PAT assay was performed as described by Spencer
et al. (1990). Leaf tissue (100 mg) from wild type
tobacco (wt), transgenic Nt-pDM307-10 tobacco (a line
transformed with the nuclear bar gene in plasmid pDM307;
Cao et al., 1992), and plastid bar gene transformants
was homogenized in 1 volume of extraction buffer (10 mM
Na2HPO4, 10 mM NaCl). The supernatant was collected after
spinning in a microfuge for 10 minutes. Protein (25 mg)
was added to 1 mg/ml PPT and 14C-labeled acetyl-CoA. The
reaction was incubated at 37°C for 30 minutes and the
entire reaction was spotted onto a TLC plate. Ascending
chromatography was performed in a 3:2 mixture of
1-propanol and NH4OH. Film was exposed to the TLC plate
overnight at room temperature.



Herbicide Application 

Wild type and transgenic plants were sprayed with 5 
ml of a 2% solution of Liberty (AgrEvo, Wilmington, DE) 
with an aerosol sprayer. 



RESULTS AND DISCUSSION 

First the bacterial bar gene was converted into a
plastid gene by cloning the bar coding region into a
plastid expression cassette. This cassette consists of
an engineered plastid rRNA operon promoter (Prrn) and
TrbcL, the 3' UTR of the plastid rbcL gene, for
stabilization of the mRNA. The plastid bar gene was then
cloned into the plastid transformation vector to yield
plasmid pJEK6, and introduced into plastids on the
surface of microscopic tungsten particles. The bar gene
integrated into the plastid genome by two homologous
recombination events via the plastid targeting
sequences, as shown in Figure 16A. Selection for the
linked aadA (spectinomycin resistance) gene on
spectinomycin-containing medium eventually yielded cells
which carried a uniformly transformed plastid genome
population, which were then regenerated into plants.

Integration of bar and aadA was verified by DNA gel
blot analysis. Total cellular DNA of wild-type and
transplastomic plants was digested with the SmaI and
BglII restriction enzymes and probed with the 2.9-kb
ApaI-BamHI plastid targeting fragment of N. tabacum
(Figure 16B). The two fragments that were expected for
the transgenic plants, 3.3 kb and 1.9 kb, were present
in each of the transplastomic samples shown in Figure
16B. Absence of the 2.9 kb wild type fragment indicated
that, by the time these plants were regenerated, the
wild-type plastid genome copies had been diluted out on
the selective medium.
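
The interpretation of the SmaI/BglII blot reduces to a simple decision rule on which bands are present in a lane. The sketch below encodes that rule using the fragment sizes given above; the function name, the size tolerance and the "heteroplastomic" label for mixed lanes are illustrative assumptions rather than part of the described analysis.

```python
WILD_TYPE_KB = (2.9,)        # fragment expected from untransformed ptDNA
TRANSGENIC_KB = (3.3, 1.9)   # fragments expected after bar/aadA integration


def classify_lane(observed_kb, tolerance=0.1):
    """Classify one blot lane from its observed band sizes (in kb).

    tolerance is a hypothetical allowance for sizing error on the gel.
    """
    def present(size):
        return any(abs(size - band) <= tolerance for band in observed_kb)

    has_wild_type = all(present(s) for s in WILD_TYPE_KB)
    has_transgenic = all(present(s) for s in TRANSGENIC_KB)

    if has_transgenic and not has_wild_type:
        return "homoplastomic"
    if has_transgenic and has_wild_type:
        return "heteroplastomic"
    if has_wild_type:
        return "wild type"
    return "unclassified"


for lane in ([2.9], [3.3, 1.9], [3.3, 2.9, 1.9]):
    print(lane, "->", classify_lane(lane))
```

The same rule would apply to the EcoRI digest of Example 5 by substituting 3.1 kb for the wild-type fragment and 4.2 kb and 1.2 kb for the transgenic fragments.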

To determine if the plastid bar gene has been
expressed, leaf extracts were assayed for
phosphinothricin acetyltransferase (PAT) activity.
Conversion of PPT into acetyl-PPT indicated PAT activity
in each of the tested transplastomic lines. Data in
Figure 17 are shown for the transplastomic lines
Nt-pJEK6-2D, Nt-pJEK6-5A and Nt-pJEK6-13B. Interestingly,
PAT activity was significantly (>>10-fold) higher when
bar was expressed in the plastids, as compared to the
bar gene expressed from the cauliflower mosaic virus 35S
promoter in the nucleus of the Nt-pDM307-10 plant.

PAT expression confers resistance to PPT in tissue
culture and in the greenhouse. When wild type leaf
sections are grown in tissue culture, 10 mg/L PPT
completely blocks callus proliferation. This same PPT
concentration is suitable for the selection of nuclear
transformants after bombardment with the nuclear bar
construct in plasmid pDM307. Leaf sections of plants
expressing bar in plastids show resistance in the
presence of up to 100 mg/L PPT in the culture medium. We
have tested PPT resistance in the greenhouse, spraying
wild-type and transplastomic plants with Liberty, a
commercial formulation of PPT, at the recommended field
dose of 2%. As shown in Figure 18A, 13 days after the
treatment, the wild type plants were dead while the
transgenic plants thrived. Since then the sprayed plants
have flowered and set seed. Figure 18B shows maternal
inheritance of PPT resistance. Lack of plastid pollen
transmission results in a lack of herbicide resistance
in progeny pollinated with transgenic pollen. The
bacterial bar gene has a high G + C content (68.3%;
Genbank Accession No. X17220), while plastid genes have
a relatively high A + T content; for example the G + C
content of the highly expressed psbA and rbcL genes is
42.7% and 43.7%, respectively (Genbank Accession No.
Z00044). Differences in the G + C content are also
reflected in the codon usage biases. Interestingly, data
presented here indicate that expression of bar from S.
hygroscopicus is sufficiently high to confer resistance
to field levels of the non-selective herbicide PPT.
Furthermore, the PAT enzyme levels obtained in the
transplastomic lines are significantly higher than those
observed in the nuclear transformant. Therefore, further
improvement of the expression levels may be obtained by
optimizing the codon usage for plastids as set forth in
Example 7.
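
The G + C percentages quoted above are simple base counts over a sequence. A minimal sketch of that calculation follows; the function name is illustrative, and the two inputs are only short primer fragments from Examples 6 and 7 (bacterial bar primer P1 and synthetic primer 1A), so their values are not expected to reproduce the whole-gene figures.

```python
def gc_content(seq: str) -> float:
    """Return the percentage of G and C bases in a DNA sequence.

    Whitespace is ignored and case is normalised, so sequences can be
    pasted in the mixed-case form used for the primers in this document.
    """
    seq = "".join(seq.split()).upper()
    if not seq:
        raise ValueError("empty sequence")
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


bacterial_fragment = "AAACCATGGCACCACAAACAGAGAGCCCAGAACGACGCCC"   # primer P1, Example 6
synthetic_fragment = "ccATGgctAGCCCAGAAaGAaGaCCGGCCGAtATtaGaCG"   # primer 1A, Example 7

print(f"bacterial fragment: {gc_content(bacterial_fragment):.1f}% G + C")
print(f"synthetic fragment: {gc_content(synthetic_fragment):.1f}% G + C")
```

Applied to full coding sequences retrieved from the cited accessions, the same function would give the gene-level percentages being compared in the text.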

Advantages of incorporating bar in the plastid
genome include containment of herbicide resistance due
to the lack of pollen transmission in most crops.
Furthermore, the lack of genetic segregation would
simplify back-crossing for the introduction of herbicide
resistance into additional breeding lines.

EXAMPLE 7 

A Synthetic bar gene Improves Containment and
Enhances Expression in Plastids

The bacterial bar gene was introduced into the
tobacco plastid genome by transformation with plasmid
pJEK6, as described above in Example 6. In plasmid pJEK6
bar is expressed in a cassette consisting of the
Prrn(L)rbcL(S) promoter and TrbcL transcription
terminator. This plasmid conferred PPT resistance to
plants grown in the presence of PPT in the tissue
culture medium, but direct selection for transformed
lines was not possible. Although the PAT levels in
homoplastomic leaves were high, the amount of PAT
produced by the few pJEK6 bar copies during the early
stage of plastid transformation was probably
insufficient to protect the entire cell.

To improve bar expression in plastids a synthetic
gene was created. The codon usage was modified to mimic
that of the average tobacco photosynthetic plastid gene.
Changing the codon usage led to a lowered GC content
characteristic of higher plant plastid genes. To assist
with cloning, restriction enzyme recognition sequences
were removed and added as necessary. Codon usage
frequency in bacteria reflects relative tRNA abundance:
frequent use of codons for rare tRNAs may significantly
reduce translation efficiency. We hoped that
differential codon usage in plastids and bacteria would
reduce or prevent expression of the synthetic gene in
bacteria, thereby reducing the danger of horizontal gene
transfer to microorganisms. We also hoped that improved
bar expression in our novel promoter cassettes would
allow direct selection of plastid transformants on
PPT-containing medium.

Materials and Methods for Example 7

Codon comparisons of photosynthetic (rbcL, psaA,
psaB, psaC, psbA, psbB, psbC, psbD, psbE, psbF) plastid
genes were compiled using GCG (Genetics Computer Group,
Madison, WI). DNA mutations were then introduced into
the bacterial bar gene making its codon usage more
similar to plastid genes, while removing several
restriction enzyme sites that could interfere with
cloning. See Figure 28. The synthetic bar gene (s-bar)
was obtained by single-step assembly of the entire s-bar
gene from 28 oligonucleotides (one 44 nt primer, one 30
nt primer and twenty-six 40 nt primers) using PCR
(Stemmer et al., 1995). The top and bottom strands of
the primers overlap with each other by 20 nucleotides.
NcoI and NheI sites were added at the 5' end and an XbaI
site was added at the 3' end through PCR amplification.
To obtain the complete s-bar gene, a small aliquot of
the assembly PCR product was amplified using primers 1A
and 14B. Unchanged nucleotides are in upper case,
altered nucleotides are in lower case in the primers
listed below.

Primer 1A  ccATGgctAGCCCAGAAaGAaGaCCGGCCGAtATtaGaCG
Primer 1B  GCATaTCaGCtTCtGTaGCACGtCtaATaTCGGCCGGtCt
Primer 2A  TGCtACaGAaGCtGAtATGCCaGCaGTtTGtACaATCGTt
Primer 2B  CTTGTtTCtATaTAaTGGTTaACGATtGTaCAaACtGCtG
Primer 3A  AACCAtTAtATaGAaACAAGtACaGTaAACTTtaGaACtG
Primer 3B  tTCtTGaGGTTCtTGaGGtTCaGTtCtaAAGTTtACtGTa
Primer 4A  AaCCtCAaGAACCtCAaGAaTGGACtGAtGAtCTaGTCCG
Primer 4B  AaGGATAGCGCTCtCGtAGACGGACtAGaTCaTCaGTCCA
Primer 5A  TCTaCGaGAGCGCTATCCtTGGCTtGTaGCaGAaGTtGAC
Primer 5B  GCGATaCCaGCtACtTCaCCGTCaACtTCtGCtACaAGCC
Primer 6A  GGtGAaGTaGCtGGtATCGCaTAtGCGGGCCCtTGGAAGG
Primer 6B  CCAaTCaTAtGCaTTtCtTGCCTTCCAaGGGCCCGCaTAt
Primer 7A  CAaGaAAtGCaTAtGAtTGGACaGCtGAaTCaACtGTtTA
Primer 7B  GtTGaTGaCGtGGtGAaACGTAaACaGTtGAtTCaGCtGT
Primer 8A  CGTtTCaCCaCGtCAtCAaCGtACaGGACTtGGtTCtACt
Primer 8B  TTCAGtAGaTGtGTaTAtAGaGTaGAaCCaAGtCCtGTaC
Primer 9A  CTaTAtACaCAtCTaCTGAAaTCttTGGAGGCACAaGGtT
Primer 9B  aACAGCtACaACaCTCTTaAAaCCtTGTGCCTCCAaaGAt
Primer 10A TtAAGAGtGTtGTaGCTGTtATaGGatTGCCtAAtGAtCC
Primer 10B CtTCaTGCATGCGtACaCtTGGaTCaTTaGGCAatCCtAT
Primer 11A aAGtGTaCGCATGCAtGAaGCtCTaGGATATGCtCCaaGa
Primer 11B CCtGCaGCCCtCAaCATaCCtCttGGaGCATATCCtAGaG
Primer 12A GGtATGtTGaGGGCtGCaGGtTTCAAaCAtGGaAACTGGC
Primer 12B tTGCCAaAAACCtACaTCATGCCAGTTtCCaTGtTTGAAa
Primer 13A ATGAtGTaGGTTTtTGGCAaCTtGAtTTCAGtCTaCCaGT
Primer 13B GtAGaACtGGACGaGGaGGTACtGGtAGaCTGAAaTCaAG
Primer 14A ACCtCCtCGTCCaGTtCTaCCaGTtACtGAGATCTGATGA
Primer 14B tctagaTCATCAGATCTCaGTaACtG
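
The 20-nucleotide overlap between neighbouring top- and bottom-strand oligonucleotides can be checked computationally before ordering or assembling them. The sketch below does this for the first three primers only (1A, 1B and 2A, with the spacing of the printed list removed); the helper names are illustrative, and the simple end-to-end extension is a stand-in for what overlap-extension PCR achieves, not a model of the reaction itself.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clean(oligo: str) -> str:
    """Strip whitespace and normalise case."""
    return "".join(oligo.split()).upper()


def reverse_complement(oligo: str) -> str:
    return clean(oligo).translate(COMPLEMENT)[::-1]


def extend(assembled: str, next_top_sense: str, overlap: int = 20) -> str:
    """Append an oligo (written in top-strand sense) if it overlaps the 3' end."""
    if assembled[-overlap:] != next_top_sense[:overlap]:
        raise ValueError("oligos do not share the expected overlap")
    return assembled + next_top_sense[overlap:]


primer_1a = "ccATGgctAGCCCAGAAaGAaGaCCGGCCGAtATtaGaCG"
primer_1b = "GCATaTCaGCtTCtGTaGCACGtCtaATaTCGGCCGGtCt"  # bottom strand
primer_2a = "TGCtACaGAaGCtGAtATGCCaGCaGTtTGtACaATCGTt"

# Bottom-strand oligos are converted to top-strand sense before joining.
assembled = clean(primer_1a)
assembled = extend(assembled, reverse_complement(primer_1b))
assembled = extend(assembled, clean(primer_2a))

print(len(assembled), "nt assembled:", assembled)
```

Three 40 nt oligonucleotides with 20 nt overlaps yield 80 nt of contiguous sequence, beginning with the NcoI (CCATGG) and NheI (GCTAGC) sites at the 5' end.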

The amplified s-bar coding region was then cloned
into a pBSIIKS+ plasmid (Stratagene, La Jolla, CA) and
sequenced (Figure 20A). The s-bar gene was cloned into
cassettes with the chimeric PrrnLatpB+DBwt,
PrrnLrbcL+DBwt and PrrnLT7g10+DB/Ec promoters. Table 6
sets forth the plasmids used in the practice of this
example.

Table 6. Plasmids with bar genes.

Plasmid   Promoter            bar                 3'UTR   Vector
pK05      -                   synthetic (s-bar)   -       pBSIIKS+
pK03      PrrnLatpB+DBwt      synthetic (s-bar)   TrbcL   pPRV111B
pK08      PrrnLrbcL+DBwt      synthetic (s-bar)   TrbcL   pPRV111A
pK017     PrrnLT7g10+DB/Ec    synthetic (s-bar)   TrbcL   pPRV111B
pK012     PrrnLrbcL+DBwt      bacterial (bar)     TrbcL   pPRV111A

To provide a suitable cloning site at the 3'-end of
the bacterial bar gene, the EagI/BglII fragment of s-bar
was replaced with the cognate fragment of the bacterial
bar coding region. Such a bacterial bar gene is
incorporated in plasmid pK012 (Figure 21). In plasmid
pK012 the first 22 nucleotides of the bacterial bar
coding region are replaced with nucleotides from the
s-bar.

RESULTS 

The engineered bacterial bar gene in pJEK6 is
expressed both in E. coli and plants, as shown in the
previous example. We were interested to test if
modification of the codon usage affects expression of
the s-bar gene in plastids and in E. coli. In E. coli, s-bar
expression was determined by measuring PAT activity.
Extracts were prepared from bacteria carrying plasmids
pK03 and pK08 expressing s-bar from the PrrnLatpB+DBwt
and PrrnLrbcL+DBwt promoters, respectively. The
radioactive assay did not detect any activity, although
extracts from bacteria transformed with plasmids pJEK6
and pK012 carrying the bacterial bar genes gave strong
signals (Figure 22A). In plasmid pK012 the first 22
nucleotides of the bacterial bar coding region are
replaced with nucleotides from the s-bar. Therefore,
lack of expression from the s-bar in E. coli is not due
to changes within the first 22 nucleotides.

The s-bar was also introduced into plastids by
transformation with vector pK03. Extracts were prepared
from pK03- and pJEK6-transformed tobacco plants, which
carry the s-bar and bar genes, respectively. Extracts
from both types of plants contained significant PAT
activity (Figure 22B). Therefore, the synthetic bar is
expressed in plastids but not in E. coli.

Changing the bar gene codon usage abrogated
expression of the gene in E. coli. This is likely due to
the introduction of the rare AGA and AGG arginine codons
in the s-bar coding region. The triplet frequency per
thousand nucleotides for AGA and AGG is the lowest in E.
coli, reflecting low abundance of the tRNA required for
translation of these codons. The minor arginine
tRNA Arg(AGG/AGA) has been shown to be a limiting factor in
the bacterial expression of several mammalian genes. The
coexpression of the argU (dnaY) gene, which encodes
tRNA Arg(AGG/AGA), resulted in high level production of the
target protein (Makrides 1996). The bacterial bar gene
has 14 arginine codons, none of which are the rare
AGA/AGG codons. The s-bar gene has five of them, three
of which are located within the first 25 codons.
Therefore, the likely explanation for the lack of s-bar
expression in E. coli is introduction of the rare AGA
and AGG arginine codons in the s-bar coding region.
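
Locating the rare arginine codons in a coding sequence is a matter of reading it in frame and flagging AGA and AGG. The sketch below applies this to the only stretch of s-bar coding sequence printed in this document, the portion of primer 1A starting at the ATG; the function names are illustrative, and the full s-bar sequence (Figure 20A) would be needed to reproduce the gene-wide count.

```python
RARE_ARG_CODONS = {"AGA", "AGG"}


def codons(cds: str) -> list:
    """Split a coding sequence into complete in-frame codons."""
    cds = "".join(cds.split()).upper()
    usable = len(cds) - len(cds) % 3   # drop a trailing partial codon
    return [cds[i:i + 3] for i in range(0, usable, 3)]


def rare_arg_positions(cds: str) -> list:
    """Return 1-based codon positions occupied by AGA or AGG."""
    return [i + 1 for i, codon in enumerate(codons(cds)) if codon in RARE_ARG_CODONS]


# Primer 1A from its ATG start codon onward (5' end of s-bar).
s_bar_start = "ATGgctAGCCCAGAAaGAaGaCCGGCCGAtATtaGaCG"

print(codons(s_bar_start))
print("rare arginine codons at positions:", rare_arg_positions(s_bar_start))
```

Within these first twelve codons the scan finds AGA at positions 6, 7 and 12, in line with the statement that three of the rare codons lie within the first 25 codons.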

There are proteins which are toxic to E. coli but
whose expression is desirable in plastids, to which they
are not toxic. Engineering of these proteins in E. coli
poses a problem, since the commonly used PEP plastid
promoters are active in E. coli, thus the gene will be
transcribed and the mRNA translated. Incorporation of
minor codons in the coding region will prevent
translation of these proteins in E. coli. Particularly
useful in this regard is conversion of arginine codons
to AGA/AGG. If no arginine is present in the N-terminal
region, an N-terminal fusion may be designed containing
multiple AGA/AGG codons to prevent translation of the
mRNA.

Plants under field conditions are associated with
microbes living in the soil, on the leaves and inside
the plants. Gene flow from plastids to these
microorganisms has not been shown. However, it would be
an added safety measure to incorporate codons in plastid
genes which are rare in the target microorganisms, but
are efficiently translated in plastids. Incorporation of
AGA/AGG codons into the selective marker genes and the
genes of interest will prevent transfer of genes from
plants to microbes which lack the capacity to
efficiently translate the AGA/AGG codons. In case of
specific plant-microbe associations, based on
differences in codon usage preferences, genes could be
designed which would be expressed in plastids but not in
microbes.

Attempts to directly select transplastomic clones
after bombardment with the s-bar constructs have so far
failed. The s-bar coding region in Figure 20A contains
frequent and rare codons in proportions characteristic
of plastid genes. It is possible that relatively rare
codons in a specific context at a critical stage will
prevent recovery of plastid transformation events.
Examples for tissue-specific translation of mRNAs
dependent on tRNA availability are known (Zhou et al.,
1999). Therefore, we designed a second synthetic bar
gene, s2-bar, containing only frequent codons (Figure
20B). Plastid transformation with the s2-bar will enable
direct selection of plastid transformation events by PPT
resistance.

EXAMPLE 8 

FLUORESCENT ANTIBIOTIC RESISTANCE MARKER FOR FACILE
IDENTIFICATION OF TRANSPLASTOMIC CLONES IN TOBACCO AND
RICE

Plastid transformation in higher plants is
accomplished through a gradual process, during which all
the 300-10,000 plastid genome copies are uniformly
altered. Antibiotic resistance genes incorporated in the
plastid genome facilitate maintenance of transplastomes
during this process. Given the high number of plastid
genome copies in a cell, transformation unavoidably
yields chimeric tissues, in which the transplastomic
cells need to be identified and regenerated into plants.
In chimeric tissue, antibiotic resistance is not cell
autonomous: transplastomic and wild-type sectors both
are green due to phenotypic masking by the transgenic
cells. Novel genes encoding FLARE-S, a fluorescent
antibiotic resistance enzyme conferring resistance to
spectinomycin and streptomycin, which were obtained by
translationally fusing aminoglycoside
3"-adenylyltransferase [AAD] with the Aequorea victoria
green fluorescent protein (GFP), are provided in the
present example. FLARE-S facilitates distinction of
transplastomic and wild-type sectors in the chimeric
tissue, thereby significantly reducing the time and
effort required to obtain genetically stable
transplastomic lines. The utility of FLARE-S to select
for plastid transformation events was shown by tracking
segregation of transplastomic and wild-type plastids in
tobacco and rice plants after transformation with
FLARE-S plastid vectors and selection for resistance to
spectinomycin and streptomycin, respectively.

Plastid transformation vectors contain a selectable 
marker gene and passenger gene(s) flanked by homologous 
plastid targeting sequences (Zoubenko et al., 1994), and 
are introduced into plastids by biolistic DNA delivery 
(Svab et al., 1990; Svab and Maliga, 1993) or PEG 
treatment (Golds et al., 1993; Koop et al., 1996; 
O'Neill et al., 1993). The selectable marker genes may 
encode resistance to spectinomycin, streptomycin or 
kanamycin. Resistance to the drugs is conferred by the 
expression of chimeric aadA (Svab and Maliga, 1993) and 
neo (kan) (Carrer et al., 1993) genes in plastids. These 
drugs inhibit chlorophyll accumulation and shoot 
formation on plant regeneration media. The 
transplastomic lines are identified by the ability to 
form green shoots on bleached wild-type leaf sections. 
Obtaining a genetically stable transplastomic line 
involves cultivation of the cells on a selective medium, 
during which the cells divide at least 16 to 17 times 
(Moll et al., 1990). During this time wild-type and 
transformed plastids and plastid genome copies gradually 
sort out. The extended period of genome and organellar 
sorting yields chimeric plants consisting of sectors of 
wild-type and transgenic cells (Maliga, 1993). In the 
chimeric tissue antibiotic resistance conferred by aadA 
or neo is not cell autonomous: transplastomic and 
wild-type sectors are both green due to phenotypic 
masking by the transgenic tissue. Chimerism necessitates 
a second cycle of plant regeneration on a selective 
medium. In the absence of a visual marker this is an 
inefficient process, involving antibiotic selection and 
identification of transplastomic plants by PCR or 
Southern probing. The feasibility of visual 
identification of transformed sectors greatly reduces 
the effort required to obtain homoplastomic clones. 
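
The gradual sorting of wild-type and transformed genome 
copies over successive cell divisions can be illustrated 
with a toy simulation. The sketch below is not part of 
the protocol: it assumes a hypothetical cell with a fixed 
number of plastid genome copies, duplicates every copy at 
each division and passes a random half to the daughter 
cell that is followed, with no selection applied. The 
function name and all parameter values are illustrative 
assumptions. 

```python
import random


def follow_lineage(n_copies, n_transformed, divisions, rng):
    """Track the transformed fraction along one cell lineage.

    Toy model (an assumption, not the source's method): every genome
    copy is duplicated, then the daughter cell receives a random half.
    """
    history = [n_transformed / n_copies]
    for _ in range(divisions):
        # Pool after replication: each copy is present twice.
        pool = ([True] * (2 * n_transformed)
                + [False] * (2 * (n_copies - n_transformed)))
        daughter = rng.sample(pool, n_copies)  # random partitioning
        n_transformed = sum(daughter)
        history.append(n_transformed / n_copies)
    return history


if __name__ == "__main__":
    rng = random.Random(1)
    # Hypothetical numbers: 100 copies per cell, 5 transformed, 17 divisions.
    for lineage in range(5):
        fractions = follow_lineage(100, 5, 17, rng)
        print(f"lineage {lineage}: final fraction = {fractions[-1]:.2f}")
```

Without selection the transformed fraction in this model 
only drifts and can be lost; the selective medium 
described in the text is what biases the outcome towards 
transformed copies. 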

The Aequorea victoria green fluorescent protein 
(GFP) is a visual marker, allowing direct imaging of the 
fluorescent gene product in living cells without the 
need for prolonged and lethal histochemical staining 
procedures. Its chromophore forms autocatalytically in 
the presence of oxygen and fluoresces green when 
absorbing blue or UV light (Prasher et al., 1992; 
Chalfie et al., 1994; Heim et al., 1994) (reviewed in 
Prasher, 1995; Cubitt et al., 1995; Misteli and 
Spector, 1997). The gfp gene was modified for expression 
in the plant nucleus by removing a cryptic intron and 
introducing mutations to enhance brightness and to 
improve GFP solubility (Pang et al., 1996; Reichel et 
al., 1996; Rouwendal et al., 1997; Haseloff et al., 
1997; Davis and Vierstra, 1998). GFP was used to monitor 
protein targeting to nucleus, cytoplasm and plastids 
from nuclear genes (Sheen et al., 1995; Chiu et al., 
1996; Köhler et al., 1997), and to follow virus movement 
in plants (Baulcombe et al., 1995; Epel et al., 1996). 
GFP has also been used to detect transient gene 
expression in plastids (Hibberd et al., 1998). 

The expression of GFP by directly incorporating the 
gfp gene in the plastid genome is described herein. 
Incorporation of a visual marker, the GFP protein, in 
the plastid transformation vectors of the present 
invention facilitates distinction of spontaneous 
antibiotic resistant mutants and plastid transformants 
(Svab et al., 1990). Furthermore, transplastomic sectors 
in the chimeric tissue can be visually identified, 
significantly reducing the time and effort required for 
obtaining genetically stable transplastomic lines. The 
utility of the GFP marker described here is further 
enhanced by its fusion with the enzyme aminoglycoside 
3''-adenylyltransferase [AAD], which confers 
spectinomycin and streptomycin resistance to plants. 
Using a marker gene encoding a bifunctional protein, 
FLARE-S (fluorescent antibiotic resistance enzyme, 
spectinomycin and streptomycin), prevents physical 
separation of the two genes and simplifies engineering. 
Furthermore, fluorescent antibiotic resistance genes 
enable extension of plastid transformation to cereal 
crops, in which plastid transformation is not associated 
with a readily identifiable tissue culture phenotype. 

The following protocols are provided to 
facilitate the practice of the present example. 

Construction of tobacco plastid vectors. The 
aadA16gfp gene encodes the FLARE16-S fusion protein, and 
can be excised as an NheI-XbaI fragment from plasmid 
pMSK51, a pBSKSII+ derivative (GenBank Accession No.: 
not yet assigned). The fusion protein was obtained by 
cloning gfp (from plasmid pCD3-326F) downstream of aadA 
(in plasmid pMSK38), digesting the resulting plasmid 
with BstXI (at the 3' end of the aadA coding region) and 
NcoI (including the gfp translation initiation codon) 
and linking the two coding regions by a BstXI-NcoI 
compatible adapter. The adapter was obtained by 
annealing oligonucleotides 
5'-GTGGGCAAAGAACTTGTTGAAGGAAAATTGGAGCTAGTAGAAGGTCTTAAAGTCGC-3' 
and 
5'-CATGGCGACTTTAAGACCTTCTACTAGCTCCAATTTTCCTTCAACAAGTTCTTTGCCCACTACC-3'. 
The adapter connects AAD and GFP with a 
peptide of 16 amino acid residues (ELVEGKLELVEGLKVA). 
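
The relationship between the adapter sequence and the 
16-residue linker can be checked by translating the first 
oligonucleotide. The sketch below is a minimal 
illustration under two stated assumptions: the 
oligonucleotide is read in the frame whose first three 
codons (Val-Gly-Lys) are taken to be the last residues of 
AAD, and after ligation the strand continues with the 
remainder of the NcoI site (CATGG), which supplies the 
gfp initiation codon. The helper names are hypothetical. 

```python
# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate complete codons from position 0; ignore a trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First adapter oligonucleotide as given in the text (5' to 3').
TOP_OLIGO = "GTGGGCAAAGAACTTGTTGAAGGAAAATTGGAGCTAGTAGAAGGTCTTAAAGTCGC"
# Assumption: the ligated strand continues with the rest of the NcoI site.
NCOI_REMAINDER = "CATGG"

junction = translate(TOP_OLIGO + NCOI_REMAINDER)
# Drop the three AAD residues (VGK) in front and the GFP start Met behind.
linker = junction[3:-1]

if __name__ == "__main__":
    print(junction)
    print(linker, len(linker))
```

Under these assumptions the residues between the AAD 
terminus and the GFP initiator methionine reproduce the 
16-residue linker stated in the text. 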

The engineered aadA gene (Chinault et al., 
1986) in plasmid pMSK38 (pBSIIKS+ derivative) has NcoI 
and NheI sites at the 5' end and BstXI and XbaI sites at 
the 3' end of the gene. The NcoI site includes the 
translation initiation codon; the NheI and BstXI sites 
are in the coding region close to the 5' and 3' ends, 
respectively; the XbaI site is downstream of the stop 
codon. The mutations were introduced by PCR using 
oligonucleotides 
5'-GGCCATGGGGGCTAGCGAAGCGGTGATCGCCGAAGTATCG-3' and 
5'-CGAATTCTAGACATTATTTGCCCACTACCTTGGTGATCTC-3'. 

The gfp gene in plasmid CD3-326F is a 
derivative of plasmid psmGFP, encoding the soluble 
modified version of GFP (accession number U70495), 
obtained under order number CD3-326 from the Arabidopsis 
Biological Resource Center, Columbus, OH (Davis and 
Vierstra, 1998). The gfp gene in plasmid CD3-326F is 
expressed in the PpsbA/TpsbA expression cassette. The 
gfp gene in plasmid CD3-326F was obtained through the 
following steps. The BamHI-SacI fragment from CD3-326 
was cloned into pBSKS+ vector to yield plasmid CD3-326A. 
The SacI site downstream of the coding region was 
converted into an XbaI site by blunting and linker 
ligation (5'-GCTCTAGAGC; plasmid CD3-326B). An NcoI site 
was created to include the translation initiation codon 
and at the same time the internal NcoI site was removed 
by PCR amplification of the coding region N-terminus 
with primers 
5'-CCGGATCCAAGGAGATATAACACCATGGCTAGTAAAGGAGAAGAACTTTTC-3' 
and 5'-GTGTTGGCCAAGGAACAGGTAGTTTTCC-3'. The PCR-
amplified fragment was digested with BamHI and MscI 
restriction enzymes, and the resulting fragment was used 
to replace the BamHI-MscI fragment in plasmid CD3-326B 
to yield plasmid CD3-326C. The gfp coding region was 
excised from plasmid CD3-326C as an NcoI-XbaI fragment 
and cloned into a psbA cassette to yield plasmid 
CD3-326D. PpsbA and TpsbA are the psbA gene promoter and 
3'-untranslated region derived from plasmid pJS25 
(Staub and Maliga, 1993). TpsbA has been truncated by 
inserting a HindIII linker downstream of the modified 
BspHI site (Peter Hajdukiewicz, unpublished). The 
PpsbA::gfp::TpsbA gene was excised as an EcoRI-HindIII 
fragment and cloned into EcoRI- and HindIII-digested 
pPRV111A, to yield plasmid CD3-326F. 

The aadA16gfp coding region from plasmid pMSK51 was 
introduced into two expression cassettes. In plasmid 
pMSK53 the aadA16gfp coding region is expressed in the 
PrrnLrbcL+DBwt/TpsbA cassette, and encodes the 
FLARE16-S2 protein (fluorescent antibiotic resistance 
enzyme, spectinomycin). PrrnLrbcL+DBwt is described in 
the previous examples and derives from plasmid pHK14. The 
construct contains a chimeric promoter composed of the 
rrn operon promoter, the rbcL gene leader and downstream 
box sequence. TpsbA is the psbA gene 3' untranslated 
region, and functions to stabilize the chimeric mRNA. In 
plasmid pMSK54 the aadA16gfp coding region is expressed 
in the PrrnLatpB+DBwt/TpsbA cassette, and encodes the 
FLARE16-S1 protein. PrrnLatpB+DBwt derives from plasmid 
pHK10, and is a chimeric promoter composed of the rrn 
operon promoter, the atpB leader and downstream box 
sequence. See Examples 1-4. 

The chimeric aadA16gfp genes were introduced 
into the tobacco plastid transformation vector pPRV111B 
(Zoubenko et al., 1994). The aadA gene was excised from 
plasmid pPRV111B with EcoRI and SpeI restriction 
enzymes, and replaced with the EcoRI-SpeI fragment from 
plasmids pMSK53 and pMSK54 to generate plasmids pMSK57 
(aadA16gfp-S2) and pMSK56 (aadA16gfp-S1). 

Construction of rice plastid vectors. Plasmid 
pMSK49 is a rice-specific plastid transformation vector 
which carries the aadA11gfp-S3 gene as the selective 
marker in the trnV/rps12/7 intergenic region (GenBank 
Accession Number: not yet assigned). Plasmid pMSK49 
carries the rice SmaI-SnaBI plastid fragment 
(restriction sites at nucleotides 122488 and 125878 in 
the genome; Hiratsuka et al., 1989) cloned into a 
pBSKSII+ (Stratagene) vector after blunting the SacI and 
KpnI restriction sites. The XbaI site present in the 
rice plastid DNA fragment (position at nucleotide 125032 
in the genome; Hiratsuka et al., 1989) was removed by 
filling in and religation. Prior to cloning the 
selective marker the progenitor plasmid was digested 
with the BglII restriction enzyme, giving rise to a 
deletion of 119 nucleotides between two proximal BglII 
sites (positions at 124367 and 124491). The aadA11gfp-S3 
gene was then cloned in the blunted BglII sites. 

The aadA gene in plasmid pMSK49 was obtained by 
modifying the aadA gene in plasmid pMSK38 (above) to 
obtain plasmid pMSK39. The modification involved 
translationally fusing the aadA gene product at its 
N-terminus with an epitope of the human c-Myc protein 
(amino acids 410-419; EQKLISEEDL; Kolodziej and Young, 
1991). The genetic engineering was performed by ligating 
an adapter obtained by annealing complementary 
oligonucleotides with appropriate overhangs into 
NcoI-NheI digested pMSK38 plasmid. The oligonucleotides 
were: 
5'-CATGGGGGCTAGCGAACAAAAACTCATTTCTGAAGAAGACTTGC-3' and 
5'-CTAGGCAAGTCTTCTTCAGAAATGAGTTTTTGTTCGCTAGCCCC-3'. 

The aadA11gfp gene encoding FLARE11-S was obtained 
by linking AAD and GFP with the 11-mer peptide 
ELAVEGKLEVA. To clone aadA and gfp in the same 
polycloning site, gfp (EcoRI-HindIII fragment; from 
plasmid CD3-326F) was cloned downstream of aadA in 
plasmid pMSK39 to obtain plasmid pMSK41. The two genes 
were excised together as an NheI-HindIII fragment, and 
cloned into plasmid pMSK45 to replace a kanamycin-
resistance gene, yielding plasmid pMSK48. Plasmid pMSK45 
is a derivative of plasmid pMSK35 which carries the 
PrrnLT7g10+DB/Ec promoter. The promoter consists of the 
plastid rRNA operon promoter and the leader sequence of 
the T7 phage gene 10. In plasmid pMSK48, aadA is 
expressed from the PrrnLT7g10+DB/Ec promoter. The aadA 
and gfp genes were then translationally fused with a 
BstXI-NcoI adapter that links AAD and GFP with an 
11-mer peptide. The adapter was obtained by annealing 
oligonucleotides 
5'-GTGGGCAAAGAACTTGCAGTTGAAGGAAAATTGGAGGTCGC-3' and 
5'-CATGGCGACCTCCAATTTTCCTTCAACTGCAAGTTCTTTGCCCACTACC-3', 
which was ligated into BstXI/NcoI digested pMSK48 
plasmid DNA to yield plasmid pMSK49. Plasmid pMSK49 has 
the rice plastid targeting sequences present in plasmid 
pMSK35. 
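
Whether two adapter oligonucleotides anneal with the 
expected single-stranded ends can be checked with a short 
script. The sketch below is illustrative only and uses a 
hypothetical helper, overhangs(): it locates the reverse 
complement of the first oligonucleotide inside the second 
and reports what the second strand carries beyond the 
duplex. It assumes perfect base pairing along the whole 
length of the first oligonucleotide. 

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def overhangs(top, bottom):
    """Return (5' extension, 3' extension) of `bottom` beyond the duplex.

    Returns None if `top` does not pair with `bottom` over its full length.
    """
    core = reverse_complement(top)
    start = bottom.find(core)
    if start < 0:
        return None
    return bottom[:start], bottom[start + len(core):]


# The 11-mer linker adapter oligonucleotides as given in the text (5' to 3').
TOP = "GTGGGCAAAGAACTTGCAGTTGAAGGAAAATTGGAGGTCGC"
BOTTOM = "CATGGCGACCTCCAATTTTCCTTCAACTGCAAGTTCTTTGCCCACTACC"

if __name__ == "__main__":
    print(overhangs(TOP, BOTTOM))
```

For this pair the second strand extends by CATG at its 5' 
end, matching the four-base 5' overhang left by NcoI, and 
by TACC at its 3' end, which is consistent with the 3' 
overhang expected from the BstXI cut in aadA. 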

Tobacco plastid transformation. Tobacco leaves from 
4- to 6-week-old plants were bombarded with DNA-coated 
tungsten particles using the Dupont PDS1000He Biolistic 
gun (1100 psi). Transplastomic clones were identified as 
green shoots regenerating on bleached leaf sections on 
RMOP medium containing 500 mg/L spectinomycin 
dihydrochloride (Svab and Maliga, 1993). The 
spectinomycin resistant shoots were illuminated with UV 
light (Model B 100AP, UV Products, Upland, California, 
USA). Shoots emitting green light were transferred to 
spectinomycin-free MS medium (Murashige and Skoog, 1962) 
(3% sucrose) on which fluorescent (transplastomic) and 
non-fluorescent (wild-type) sectors formed. Fluorescent 
sectors were excised, and transferred to selective (500 
mg/L spectinomycin) shoot regeneration (RMOP) medium. 
Regenerated shoots were tested for uniform 
transformation by Southern analysis. 

Rice plastid transformation. Callus formation from 
mature Oryza sativa cv. Taipei 309 seeds was induced on 
a modified CIM medium (Tompson et al., 1986) containing 
MS salts and vitamins (2 mg/L glycine, 0.5 mg/L 
nicotinic acid, 0.5 mg/L pyridoxine and 0.1 mg/L 
thiamine), 2 mg/L 2,4-D, 1 mg/L kinetin, 300 mg/L 
casein enzymatic hydrolysate Type III (Sigma C-1026) and 
sucrose (30 g/L). Embryogenic suspensions from the 
proliferating embryogenic calli were obtained on the AA 
medium (Muller and Grafe, 1978). For plastid 
transformation by the biolistic process rice embryogenic 
cells were plated on a filter paper on non-selective 
modified CIM medium (Tompson et al., 1986). The 
bombarded cells were incubated for 48 hours, transferred 
to selective liquid AA medium (Muller and Grafe, 1978) 
(one to two weeks), and then to solid modified RRM 
regeneration medium (Zhang and Wu, 1988) containing MS 
salts and vitamins, 100 mg/L myo-inositol, 4 mg/L BAP, 
0.5 mg/L IAA, 0.5 mg/L NAA, 30 g/L sucrose, 40 g/L 
maltose and 100 mg/L streptomycin sulfate, on which green 
shoots appeared in two to three weeks. The shoots were 
rooted on a selective MS salt medium (Murashige and 
Skoog, 1962) containing 30 g/L sucrose and 100 mg/L 
streptomycin sulfate. Leaf samples for PCR analysis and 
confocal microscopy were taken from plants on selective 
medium. 

PCR amplification of border fragments. Total 
cellular DNA was extracted according to Mettler 
(Mettler, 1987). The PCR analysis was carried out with a 
9:1 mixture of AmpliTaq (Stratagene) and Vent (New 
England Biolabs) DNA polymerases in the Vent buffer 
following the manufacturer's recommendations. The left 
border fragment was amplified with primers 03 
(5'-ATGGATGAACTATACAAATAAG-3') and 04 
(5'-GCTCCTATAGTGTGACG-3'). The right border fragment was 
amplified with primers 05 (5'-ACTACCTCTGATAGTTGAGTCG-3') 
and 06 (5'-AGAGGTTAATCGTACTCTGG-3'). The aadA part of 
FLARE-S genes was amplified with primers 01 
(5'-GGCTCCGCAGTGGATGGCGGCCTG-3') and 02 
(5'-GGGCTGATACTGGGCCGGCAGG-3'). Primer positions are 
shown in Fig. 5A. Note that the same primers can be used 
in transplastomic tobacco and rice plants expressing 
FLARE-S. 
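
Basic properties of the six primers listed above can be 
tabulated with a few lines of code. The sketch below 
computes length, GC fraction and a rough melting 
temperature by the Wallace rule (2 degrees per A or T, 
4 degrees per G or C). The Wallace rule is a rule of thumb 
for short oligonucleotides and is used here as an 
assumption for illustration; the text does not state how 
annealing temperatures were chosen. 

```python
# Primer sequences (5' to 3') and labels as given in the text.
PRIMERS = {
    "01": "GGCTCCGCAGTGGATGGCGGCCTG",
    "02": "GGGCTGATACTGGGCCGGCAGG",
    "03": "ATGGATGAACTATACAAATAAG",
    "04": "GCTCCTATAGTGTGACG",
    "05": "ACTACCTCTGATAGTTGAGTCG",
    "06": "AGAGGTTAATCGTACTCTGG",
}


def gc_fraction(seq):
    """Fraction of G and C bases in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Rough melting temperature in degrees C: 2*(A+T) + 4*(G+C)."""
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name}  len={len(seq):2d}  GC={gc_fraction(seq):.2f}  "
              f"Tm~{wallace_tm(seq)} C")
```

Such a table is only a first check that the members of a 
primer pair are reasonably matched; actual cycling 
conditions would follow the polymerase supplier's 
recommendations, as stated in the text. 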

Detection of FLARE-S by fluorescence. FLARE-S 
expressing sectors in the leaves were visualized by an 
Olympus SZX stereo microscope equipped for GFP detection 
with a CCD camera system. Subcellular localization of 
GFP was verified by laser-scanning confocal microscopy 
(Sarastro 2000 Confocal Image System, Molecular 
Dynamics, Sunnyvale, CA). This system includes an argon 
mixed gas laser with lines at 488 and 568 nm and 
detector channels. The channels are adjusted for 
fluorescein and rhodamine images. GFP fluorescence was 
detected in the FITC channel (488-514 nm). Chlorophyll 
fluorescence was detected in the TRITC channel (560-580 
nm). The images produced by GFP and chlorophyll 
fluorescence were viewed on a computer screen attached 
to the microscope and processed using the Adobe 
PhotoShop software. 

Immunoblot analysis. Leaves (0.5 g) collected from 
plants in sterile culture were frozen in liquid nitrogen 
and ground to a fine powder in a mortar with a pestle. 
For protein extraction the powder was transferred to a 
centrifuge tube containing 1 ml buffer [50 mM Hepes/KOH 
(pH 7.5), 1 mM EDTA, 10 mM potassium acetate, 5 mM 
magnesium acetate, 1 mM dithiothreitol and 2 mM PMSF] 
and mixed by flicking. The insoluble material was 
removed by centrifugation at 4°C for 5 min at 11,600 g. 
Protein concentration in the supernatant was determined 
using the Biorad protein assay reagent kit. Proteins (20 
µl per lane) were separated in 12% SDS-PAGE (Laemmli, 
1970). Proteins separated by SDS-PAGE were transferred 
to a Protran nitrocellulose membrane (Schleicher and 
Schuell) using a semi-dry electroblotting apparatus 
(Bio-Rad). The membrane was incubated with Living Colors 
Peptide Antibody (Clontech) diluted 1 to 200. FLARE-S 
was visualized using ECL chemiluminescence immunoblot 
detection on X-ray film. FLARE-S on the blots was 
quantified by comparison with a dilution series of 
commercially available purified wild-type GFP 
(Clontech). 

RESULTS AND DISCUSSION 
Tobacco plastid vectors with FLARE-S as the 
selectable marker. 

Two FLARE-S fusion proteins were tested in E. coli. 
In one, AAD and GFP were linked by an 11-mer 
(ELAVEGKLEVA), in the second by a 16-mer 
(ELVEGKLELVEGLKVA) linker. For transformation in 
tobacco, the aadA16gfp coding region (16-mer linker) was 
expressed in two cassettes known to mediate high levels 
of protein accumulation in plastids. Both utilize the 
strongest known plastid promoter, which drives the 
expression of the ribosomal RNA operon (Prrn), and the 
3'-UTR of the highly expressed psbA gene (TpsbA) for the 
stabilization of the chimeric mRNAs. The PrrnLatpB+DBwt 
(plasmid pMSK56) and PrrnLrbcL+DBwt (plasmid pMSK57) 
promoters utilize the atpB or rbcL gene leader sequences 
and the coding region N-termini with the downstream box 
(DB) sequence, respectively. Due to inclusion of the DB 
sequence in the chimeric genes, the proteins encoded by 
the two genes are slightly different, having 14 amino 
acids of the ATPase β subunit (atpB gene product) or 
ribulose 1,5-bisphosphate carboxylase/oxygenase (rbcL 
gene product) translationally fused with FLARE16-S 
(FLARE16-S1 and FLARE16-S2, respectively). To obtain a 
plastid transformation vector with the fluorescent 
spectinomycin resistance genes, the chimeric genes were 
cloned into the trnV/rps12/7 plastid intergenic region 
in plastid vector pPRV111B. Plasmids pMSK56 and pMSK57 
(Fig. 23) express FLARE16-S1 and FLARE16-S2, 
respectively, as markers. 

Identification of transplastomic tobacco clones by 
fluorescence. Transformation was carried out by 
biolistic delivery of pMSK56 and pMSK57 plasmid DNA into 
chloroplasts. The bombarded leaves were transferred onto 
selective (500 mg/L spectinomycin) shoot regeneration 
medium. Wild-type leaves on this medium bleach and form 
white callus. Cells with transformed plastids regenerate 
green shoots. The leaves on the selective medium were 
regularly inspected with a hand-held long-wave UV lamp 
for FLARE-S fluorescence. 

No fluorescence could be detected in young shoots 
(3 to 5 mm in size) developing on pMSK56-bombarded 
leaves. However, formation of bright sectors in the 
leaves was observed when these small shoots were 
transferred onto non-selective plant maintenance medium. 
In contrast, cultures bombarded with plasmid pMSK57 
yielded small fluorescent shoots at an early stage. 
These fluorescent shoots, and some of the 
non-fluorescent ones, developed into plants with bright 
sectors on non-selective plant maintenance medium. 
Therefore, FLARE16-S2 is useful for early detection of 
plastid transformation events. FLARE16-S2 fluorescence 
in young shoots on a selective medium should be due to 
relatively high levels of FLARE16-S2. Higher levels of 
FLARE16-S2 are also indicated by the brighter sectors in 
variegated leaves expressing FLARE16-S2 as compared to 
FLARE16-S1. 

The size of sectors was different in individual 
shoots. FLARE-S expression in different leaf layers was 
also obvious. With the traditional selection for 
spectinomycin resistance, the transplastomic and 
wild-type sectors are not visible. Regeneration of plants 
with uniformly transformed plastid genomes was greatly 
facilitated by the fluorescing sectors expressing 
FLARE-S, which could be readily identified in UV light, 
dissected, and transferred for a second cycle of plant 
regeneration on spectinomycin-containing (500 mg/L) 
selective medium. 

Given the high levels of FLARE-S accumulation, we 
were interested to find out whether FLARE-S is toxic to 
plants. We expected that toxicity should be manifested 
as lower transformation efficiencies. Bombardment of 30 
tobacco leaves with plasmids pMSK56 and pMSK57 yielded 
71 and 89 spectinomycin resistant clones, respectively. 
Out of these, 61 and 77 lines were verified as 
transplastomic by fluorescence. Plastid transformation 
in a subset of these was confirmed by confocal laser 
scanning microscopy (7 clones each; see below) and 
Southern analysis (4 clones). The frequency of plastid 
transformation events with the FLARE-S-expressing genes 
was slightly higher (~2 instead of ~1 per bombardment) 
than reported earlier with a chimeric aadA gene at the 
same insertion site (Svab and Maliga, 1993). Therefore, 
we assume that accumulation of FLARE-S at high levels is 
not detrimental. Lack of toxicity is also supported by 
the apparently normal phenotype of the plants in the 
greenhouse (not shown). 
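
The arithmetic behind the frequency of about two events 
per bombardment can be reproduced directly from the clone 
counts given above. The sketch below assumes that 30 
leaves were bombarded with each plasmid (the text could 
also be read as 30 leaves in total); the class and 
property names are hypothetical. 

```python
from dataclasses import dataclass


@dataclass
class BombardmentResult:
    plasmid: str
    bombarded_leaves: int      # assumption: 30 leaves per plasmid
    resistant_clones: int      # spectinomycin resistant clones
    fluorescent_clones: int    # clones verified by fluorescence

    @property
    def clones_per_bombardment(self):
        """Verified transplastomic clones per bombarded leaf."""
        return self.fluorescent_clones / self.bombarded_leaves

    @property
    def fluorescent_fraction(self):
        """Share of resistant clones that were fluorescent."""
        return self.fluorescent_clones / self.resistant_clones


RESULTS = [
    BombardmentResult("pMSK56", 30, 71, 61),
    BombardmentResult("pMSK57", 30, 89, 77),
]

if __name__ == "__main__":
    for r in RESULTS:
        print(f"{r.plasmid}: {r.clones_per_bombardment:.2f} clones per leaf, "
              f"{r.fluorescent_fraction:.1%} of resistant clones fluorescent")
```

Under that reading both plasmids give between two and 
three verified clones per bombarded leaf. 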

Localization of FLARE-S to tobacco plastids by 
confocal microscopy. Due to phenotypic masking, 
transplastomic and wild-type sectors in a chimeric leaf 
are both green on a selective medium. However, we have 
found that in chimeric leaf sectors some plastids in the 
same cell express FLARE-S while others do not, when 
observed by confocal microscopy (Fig. 24). FLARE-S and 
chlorophyll fluorescence were detected separately in the 
fluorescein and rhodamine channels, respectively. The 
two images were then overlaid, confirming that FLARE-S 
fluorescence derives from chloroplasts. 

Expression of FLARE-S was also studied in 
non-green plastid types including the chromoplasts in 
petals and the non-green plastids in root cells (Fig. 
24b, f). These studies were carried out in plants which 
were homoplastomic for the transgenomes. The 
homoplastomic state was important, since in non-green 
tissues chlorophyll could not be used for confirmation 
of the organelles as plastids. Since FLARE-S expression 
could be readily detected in chloroplasts as well as 
non-green plastids, the plastid rRNA operon promoter is 
apparently active in all plastid types. 

FLARE-S accumulation in tobacco leaves. 

Accumulation of FLARE-S in homoplastomic leaves was 
tested using the commercially available GFP antibody, 
recognizing the GFP portion (239 amino acid residues) of 
FLARE16-S (520 amino acids). FLARE16-S1 (532 amino 
acids) was ~8%, whereas FLARE16-S2 (532 amino acids) 
was ~18% of total soluble leaf protein (Fig. 25). To 
calculate FLARE16-S concentrations, a GFP dilution 
series was used as a reference, and the values were then 
multiplied by 2.6 to correct for the larger size of the 
FLARE16-S1 and -S2 proteins. 
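
The quantification just described can be sketched as a 
two-step calculation: read a GFP-equivalent amount off the 
dilution series, then apply the size correction. The code 
below is an illustration only. It assumes that band 
signal rises linearly between neighbouring standards, and 
every number in the usage example is hypothetical; only 
the factor of 2.6 comes from the text. 

```python
SIZE_CORRECTION = 2.6  # correction factor stated in the text


def gfp_equivalent(signal, standards):
    """Interpolate a GFP amount (ng) from (ng, signal) standard pairs.

    Assumption: signal is linear between neighbouring standards.
    """
    points = sorted(standards, key=lambda p: p[1])
    for (ng_lo, s_lo), (ng_hi, s_hi) in zip(points, points[1:]):
        if s_lo <= signal <= s_hi:
            return ng_lo + (ng_hi - ng_lo) * (signal - s_lo) / (s_hi - s_lo)
    raise ValueError("signal outside the range of the dilution series")


def flare_percent_of_tsp(signal, standards, total_protein_ng):
    """FLARE-S as a percentage of total soluble protein in the lane."""
    flare_ng = SIZE_CORRECTION * gfp_equivalent(signal, standards)
    return 100 * flare_ng / total_protein_ng


if __name__ == "__main__":
    # Hypothetical dilution series and sample lane, for illustration only.
    series = [(10, 100.0), (20, 200.0), (40, 400.0)]
    print(flare_percent_of_tsp(300.0, series, total_protein_ng=1000))
```

Signals outside the range of the standards are rejected 
rather than extrapolated, since film-based detection is 
unlikely to be linear over a wide range. 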


Tracking plastid transformation in rice by FLARE-S 
expression. In rice, plant regeneration is from 
non-green embryogenic cells. Encouraged by FLARE-S 
expression in non-green tobacco plastids, we attempted 
to transform the non-green plastids of embryogenic rice 
tissue-culture cells. Plastid transformation was carried 
out using a rice-specific vector expressing FLARE11-S3 
and targeting insertion of the aadA11gfp-S3 gene in the 
trnV/rps12/7 intergenic region. The location of the 
insertion site and the size of plastid targeting 
sequences in the rice vector are similar to the tobacco 
vectors shown in Fig. 23. 

Plastid transformation in rice was carried out 
by bombardment of embryogenic rice suspension culture 
cells using gold particles coated with plasmid pMSK49 
DNA. Rice cells, like those of most cereals, are 
naturally resistant to spectinomycin (Fromm et al., 
1987). FLARE-S, however, confers resistance to 
streptomycin as well (Svab and Maliga, 1993). Therefore, 
selection for transplastomic lines was carried out on 
selective streptomycin medium (100 mg/L). Streptomycin 
at this concentration inhibits the growth of embryogenic 
rice cells. After bombardment, the rice cells were first 
selected in liquid embryogenic AA medium, then on the 
solid plant regeneration medium, on which the surviving 
resistant cells regenerated green shoots (12 in 25 
bombarded plates). These shoots were rooted, and grown 
into plants. PCR amplification of border fragments in 
DNA isolated from the leaves of these plants confirmed 
integration of aadA11gfp-S3 sequences in the plastid 
genome (Fig. 26). The left and right border fragments 
cannot be amplified if the gene is integrated into the 
nuclear genome, as one of the primers (04 or 06) of the 
pairs is outside the plastid targeting regions. 

FLARE11-S3 expression in the leaves of two of 
the PCR-positive plants was tested by confocal 
laser-scanning microscopy. In rice, as in tobacco, the 
FLARE-S marker confirmed segregation of transplastomic 
and wild-type plastids (Fig. 27). In rice only a small 
fraction of chloroplasts expressed FLARE-S. Since 
individual cells marked with arrows in Fig. 27 contained 
a mixed population of wild-type and transgenic 
chloroplasts, FLARE-S in these cells could be expressed 
only from the plastid genome. Integration of 
aadA11gfp-S3 into the nuclear genome downstream of a 
plastid-targeting transit peptide would result in uniform 
expression of FLARE-S in each of the chloroplasts within 
the cell. 

The sequences of the selectable marker genes of the 
invention are provided in Figures 28-34. Figure 35 
depicts a table describing the selectable marker genes 
disclosed in the present example. 

Direct visual identification of transplastomic 
sectors requires high level expression of FLARE-S in 
plastids. High GFP expression levels in Arabidopsis were 
toxic, interfering with plant regeneration. Toxicity of 
wild-type (insoluble) GFP was linked to GFP accumulation 
in the nucleus and cytoplasm, and could be eliminated by 
targeting it to the endoplasmic reticulum (Haseloff et 
al., 1997). GFP aggregates were also cytotoxic to E. 
coli cells (Crameri et al., 1996). To enhance 
fluorescence intensity and to avoid cytotoxicity, 
soluble versions of the codon-modified GFP were obtained 
(Davis and Vierstra, 1998). We have utilized the gene 
for a soluble-modified GFP described by Davis and 
Vierstra (Davis and Vierstra, 1998) to create variants 
of FLARE-S, a fusion protein which does not have an 
apparent cytotoxic effect. The frequency of plastid 
transformation, if affected at all, is increased rather 
than decreased. In tobacco, we normally obtain one 
transplastomic clone per bombarded leaf sample (Svab and 
Maliga, 1993), whereas with the FLARE-S genes on average 
we could recover two clones per sample. Plant 
regeneration from highly fluorescent tissue was readily 
obtained, and the regenerated plants have a phenotype 
indistinguishable from the wild type. 

Plastid transformation in rice requires expression 
of the selective marker in non-green plastids. The rRNA 
operon has two promoters, one for the eubacterial-type 
(PEP) and one for the phage-type (NEP) plastid RNA 
polymerase. The promoter driving FLARE-S expression is 
recognized only by the eubacterial-type plastid RNA 
polymerase. Previously, it was assumed that the 
eubacterial-type promoter is active only in chloroplasts 
(Maliga, 1998). Accumulation of FLARE-S in roots and 
petals indicates that PEP is also active in non-green 
plastids. 

Plastid transformation is a process that 
unavoidably yields chimeric plants, since cells of 
higher plants contain a large number (300 to 50,000) of 
plastid genome copies (Bendich, 1987), out of which 
initially only a few are transformed. High level 
expression of FLARE-S in plastids provides the means for 
visual identification of transplastomic sectors, even if 
they are present in a chimeric tissue. GFP and AAD could 
be expressed from two different genes in a plastid 
transformation vector. However, transformation with a 
marker gene encoding a bifunctional protein prevents 
separation of the two genes and simplifies engineering. 
The fluorescent selective marker will significantly 
reduce the work required to obtain genetically stable 
plastid transformants in tobacco, a species in which 
plastid transformation is routine. The bottleneck of 
applying plastid transformation in crop improvement is 
the lack of technology. In tobacco, chimeric clones with 
transformed plastids are readily identified by shoot 
regeneration (Svab et al., 1990). In Arabidopsis, clones 
with transformed plastids are identified by greening 
(Sikdar et al., 1998). We have shown here that FLARE-S 
is a suitable marker to select for transplastomes in 
embryogenic rice cells, which lack the visually 
identifiable tissue culture phenotypes exploited in 
tobacco and Arabidopsis. Data presented here are the 
first example of stable integration of foreign DNA into 
the rice plastid genome. These rice plants are 
heteroplastomic. Uniformly transformed rice plants will 
be obtained by further selection on streptomycin medium 
and screening the embryogenic cells for FLARE-S 
expression. Thus, the FLARE-S marker system will enable 
extension of plastid transformation to cereal crops. 



The utility of the new chimeric promoters 

The σ70-type plastid ribosomal RNA operon promoter, 
Prrn, is the strongest known plastid promoter expressed 
in all tissue types. The ultimate product of this 
promoter in the plastid is RNA, not protein. Therefore, a 
series of chimeric promoters was constructed to 
facilitate protein accumulation from Prrn, using 
expression of the neomycin phosphotransferase (NPTII) 
enzyme as the reference protein. 


1) The expression cassettes have distinct tissue-
specific expression profiles. Some of the expression 
cassettes described here will facilitate relatively high 
levels of protein expression in all tissues, including 
leaves, roots and seeds. Other cassettes have different 
expression profiles: for example, they will facilitate 
moderate levels of protein accumulation in the leaves 
while leading to relatively high levels of protein 
accumulation in the roots. Accumulation of a protein at 
levels of 10% to 50% of total soluble protein is 
considered high-level protein expression; low levels of 
protein expression would be in the range of ~0.1% of 
total soluble cellular protein. 

2) The efficiency of the selectable marker gene 
depends on the rate at which the gene product 
accumulates during the early stage of transformation. 
Since the gene is initially present in only a few copies 
per cell, high levels of expression from a few copies 
will provide protection from toxic substances early on, 
facilitating efficient recovery of transformed lines. 
The expression cassettes will be useful to drive the 
expression of the genes conferring resistance to the 
antibiotics streptomycin, spectinomycin and hygromycin, 
and the herbicides phosphinothricin and glyphosate. In 
such applications addition of amino acids at the 
N-terminus is acceptable, as long as it does not 
interfere with the expression of the selectable marker 
genes. NPTII is such an enzyme. In cases like NPTII, an 
N-terminal fusion, and thereby the mRNA "Downstream Box" 
sequences, give an additional increase in protein levels 
of at least two- to four-fold. The -DB construct, which 
relied on an NheI site and involved addition of one 
(N-terminal) amino acid of the source gene coding 
region, is convenient but not necessary. When 
translational fusion is not feasible due to inactivation 
of proteins, seamless in-frame constructs may be created 
by PCR methods outlined in the application. 

3) A second major area in which application of the chimeric
promoters is extremely useful is protein expression for
pharmaceutical, industrial or agronomic purposes. The
examples include, but are not restricted to, production of
vaccines, healthcare products like human hemoglobin, and
industrial or household enzymes.
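
As an illustration of the "Downstream Box" idea in item 2), the following Python sketch derives the mRNA sequence that would pair with every base of an anti-downstream box (ADB) of the 16S rRNA. The two ADB sequences are the plastid (pt) and E. coli (Ec) sequences given in the figures. The function names are hypothetical, and only Watson-Crick pairing is assumed (G-U wobble pairs are not modelled). Under these assumptions the result, written in DNA letters, matches the two downstream box sequences recited in claim 6.

```python
# Watson-Crick partners for RNA bases (G-U wobble pairs are not modelled).
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Anti-downstream box (ADB) sequences, written 3'->5' as in the figures.
PT_ADB_3TO5 = "AGGUCAGUGAUCGGGACGGAAGCCGU"  # plastid 16S rRNA
EC_ADB_3TO5 = "GGGUCAGUACUUAGUGUUUCACCAUU"  # E. coli 16S rRNA


def perfect_downstream_box(adb_3to5: str) -> str:
    """Return the mRNA sequence (5'->3') that pairs with every ADB base.

    The ADB is written 3'->5', so its antiparallel mRNA partner read
    5'->3' is the base-by-base complement, with no reversal needed.
    """
    adb = adb_3to5.replace(" ", "").upper()
    return "".join(RNA_COMPLEMENT[base] for base in adb)


def rna_to_dna(rna: str) -> str:
    """Write an RNA sequence in DNA letters (U -> T)."""
    return rna.upper().replace("U", "T")


if __name__ == "__main__":
    for name, adb in (("pt", PT_ADB_3TO5), ("Ec", EC_ADB_3TO5)):
        print(name, "5'-" + rna_to_dna(perfect_downstream_box(adb)) + "-3'")
```

The sketch only covers the fully complementary case; the mutant constructs (+DBm) deliberately reduce complementarity and are not modelled here.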
What is claimed is:

1. A recombinant DNA construct for expressing at least one
heterologous protein in the plastids of higher plants, said
construct comprising a 5' regulatory region which includes
a promoter element, a leader sequence and a downstream box
element operably linked to a coding region of said at least
one heterologous protein, said chimeric regulatory region
enhancing translational efficiency of an mRNA molecule
encoded by said DNA construct.

2. A vector comprising the DNA construct of claim 1.

3. A recombinant DNA construct as claimed in claim 1, said
5' regulatory region being selected from the group
consisting of PrrnLatpB+DBwt, SEQ ID NO: 1, PrrnLatpB-DB,
SEQ ID NO: 2, PrrnLatpB+DBm, SEQ ID NO: 3, PrrnLclpP+DBwt,
SEQ ID NO: 4, PrrnLclpP-DB, SEQ ID NO: 5, PrrnLrbcL+DBwt,
SEQ ID NO: 6, PrrnLrbcL-DB, SEQ ID NO: 7, PrrnLrbcL+DBm,
SEQ ID NO: 8, PrrnLpsbB+DBwt, SEQ ID NO: 9, PrrnLpsbB-DB,
SEQ ID NO: 10, PrrnLpsbA+DBwt, SEQ ID NO: 11, PrrnLpsbA-DB,
SEQ ID NO: 12, PrrnLpsbA-DB(+GC), SEQ ID NO: 13.

4. A recombinant DNA construct as claimed in claim 1, said
5' regulatory region being selected from the group
consisting of PrrnLT7g10+DB/Ec, SEQ ID NO: 14,
PrrnLT7g10+DB/pt, SEQ ID NO: 15, PrrnLT7g10-DB, SEQ ID
NO: 15.

5. A vector comprising a DNA construct as claimed in
claim 1.

6. A DNA construct as claimed in claim 1, said downstream
box element having a sequence selected from the group
consisting of
5' TCCAGTCACTAGCCCTGCCTTCGGCA 3' and
5' CCCAGTCATGAATCACAAAGTGGTAA 3'.

7. A DNA construct as claimed in claim 1, wherein said
heterologous protein is expressed from a bar gene encoded
by S. hygroscopicus, said bar gene inserted into a plasmid
selected from the group consisting of pK012 and pJEK3, said
pJEK3 having the sequence of SEQ ID NO: 18.

8. A DNA construct as claimed in claim 1, wherein said
heterologous protein is expressed from a synthetic bar
encoding nucleic acid, said synthetic bar nucleic acid
having a sequence selected from the group consisting of
SEQ ID NO: 19 and SEQ ID NO: 20.

9. A DNA construct as claimed in claim 1, said at least one
heterologous protein comprising a fusion protein.

10. A DNA construct as claimed in claim 9, said fusion
protein having a first and second coding region operably
linked to said 5' regulatory region such that production of
said fusion protein is regulated by said 5' regulatory
region, said first coding region encoding a selectable
marker gene and said second coding region encoding a
fluorescent molecule to facilitate visualization of
transformed plant cells.

11. A vector comprising the DNA construct of claim 10.

12. A DNA construct as claimed in claim 9, said fusion
protein consisting of an aadA coding region operably linked
to a green fluorescent protein coding region.

13. A DNA construct as claimed in claim 10, said aadA
coding region being operably linked to said green
fluorescent protein coding region via a nucleic acid
molecule encoding a peptide linker having a sequence
selected from the group consisting of ELVEGKLELVEGLKVA and
ELAVEGKLEVA.

14. A DNA construct as claimed in claim 10, said construct
having a sequence selected from the group of SEQ ID NOS:
21-25 and 27.



15. A plasmid for transforming the plastids of higher
plants, said plasmid being selected from the group
consisting of pHK30(B), pHK31(B), pHK60, pHK32(B),
pHK33(B), pHK34(A), pHK35(A), pHK64(A), pHK36(A), pHK37(A),
pHK38(A), pHK39(A), pHK40(A), pHK41(A), pHK42(A), pHK43(A),
pMSK56, pMSK57, pMSK48, pMSK49, pMSK35, pMSK53 and pMSK54.

16. A transgenic plant containing a plasmid as claimed in
claim 15.

17. A transgenic plant as claimed in claim 15, said plant
being selected from the group consisting of monocots and
dicots.

18. A method for producing transplastomic monocots,
comprising:

a) obtaining embryogenic cells;

b) exposing said cells to a heterologous DNA molecule under
conditions whereby said DNA enters the plastids of said
cells, said heterologous DNA molecule encoding at least one
exogenous protein, said at least one exogenous protein
encoding a selectable marker;

c) applying a selection agent to said cells to facilitate
sorting of untransformed plastids from transformed
plastids, said cells containing transformed plastids
surviving and dividing in the presence of said selection
agent;

d) transferring said surviving cells to selective media to
promote shoot regeneration and growth; and

e) rooting said shoots, thereby producing transplastomic
monocot plants.

19. A method as claimed in claim 18, wherein said
heterologous DNA molecule is introduced into said plant
cell via a process selected from the group consisting of
biolistic bombardment, Agrobacterium-mediated
transformation, microinjection and electroporation.

20. A method as claimed in claim 18, wherein protoplasts
are obtained from said embryogenic cells and said
heterologous DNA molecule is delivered to said protoplasts
by exposure to polyethylene glycol.

21. A method as claimed in claim 18, wherein said selection
agent is selected from the group consisting of
streptomycin, and paromomycin.

22. A monocot transformed via the method of claim 18.

23. A transformed monocot plant as claimed in claim 22,
said monocot plant being selected from the group consisting
of maize, millet, sorghum, sugar cane, rice, wheat, barley,
oat, rye, and turf grass.

24. A method for producing transplastomic rice plants, said
method comprising:

a) obtaining embryogenic calli;

b) inducing proliferation of calli on modified CIM medium;

c) obtaining embryogenic cell suspensions of said
proliferating calli in liquid AA medium;

d) bombarding said embryogenic cells with microprojectiles
coated with plasmid DNA;

e) transferring said bombarded cells to selective liquid AA
medium;

f) transferring said cells surviving in AA medium to
selective RRM regeneration medium for a time period
sufficient for green shoots to appear; and

g) rooting said shoots in a selective MS salt medium.

25. A method as claimed in claim 24, said plasmid DNA being
selected from the group of plasmids consisting of pMSK35
and pMSK53, pMSK54 and pMSK49.

26. A transplastomic rice plant produced by the method of
claim 24.

27. A method for containing transgenes in transformed
plants, comprising:

a) determining the codon usage in said plant to be
transformed and in microbes found in association with said
plant; and

b) genetically engineering said transgene sequence via the
introduction of rare codons to abrogate expression of said
transgene in said plant associated microbe.

28. A method as claimed in claim 27, wherein said 
transgene is a bar gene and said rare codons are 
arginine encoding codons selected from the group 
consisting of AGA and AGG. 
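
The codon-based containment approach of claims 27 and 28 can be sketched in Python as follows. The sketch tallies codon usage in a coding sequence and rewrites every arginine codon to AGA or AGG, the two codons named in claim 28. The function names, the alternation between AGA and AGG, and the toy input sequence are illustrative assumptions and are not taken from the application. Whether AGA and AGG are in fact rare in a given plant-associated microbe has to be established from that organism's codon usage, as step a) of claim 27 requires.

```python
from collections import Counter

# The six arginine codons of the standard genetic code.
ARG_CODONS = {"CGT", "CGC", "CGA", "CGG", "AGA", "AGG"}
# Arginine codons named in claim 28.
TARGET_ARG_CODONS = ("AGA", "AGG")


def split_codons(cds: str) -> list:
    """Split an in-frame coding sequence into a list of codons."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def codon_usage(cds: str) -> Counter:
    """Count how often each codon occurs (step a of claim 27, in miniature)."""
    return Counter(split_codons(cds))


def recode_arginine(cds: str) -> str:
    """Replace arginine codons other than AGA/AGG with AGA and AGG in turn.

    The encoded protein is unchanged; only the codon choice differs.
    Alternating between the two targets is an arbitrary illustrative policy.
    """
    out = []
    replaced = 0
    for codon in split_codons(cds):
        if codon in ARG_CODONS and codon not in TARGET_ARG_CODONS:
            out.append(TARGET_ARG_CODONS[replaced % 2])
            replaced += 1
        else:
            out.append(codon)
    return "".join(out)


if __name__ == "__main__":
    toy_cds = "ATGCGTCGCAGATAA"  # hypothetical input: Met-Arg-Arg-Arg-stop
    print(codon_usage(toy_cds))
    print(recode_arginine(toy_cds))
```

A real design would compare the resulting codon frequencies against usage tables for both the plant plastid and the microbe before settling on a recoding policy.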
          1        10        20    26
pt ADB 3'-AGGUCAGUGAUCGGGACGGAAGCCGU-5'
               1430        1416

          1        10        20    26
Ec ADB 3'-GGGUCAGUACUUAGUGUUUCACCAUU-5'
               1483        1469
[Alignments of the plastid anti-downstream box, pt ADB
3'-AGGUCAGUGAUCGGGACGGAAGCCGU-5', with the mRNA sequences
below. The base-pairing marks and alignment offsets are not
legible in this copy; the scores printed beside the
alignments are listed after each sequence.]

atpB wild type mRNA
AUGAGAAUCAAUCCUACUACUUCUGGUUCUGGGGUUUCCACGCUUGAAAA
(16) (15) (16) (13)

atpB mutant mRNA
AUGAGAAUaAAcCCgACaACaagUGGaagUGGGGUgUCCACGgcuagc
(9/7) (9/8) (11/9) (10/8)

clpP wild type mRNA
AUGCCUAUUGGUGUUCCAAAAGUCCCUUUCCGAAGUCCUGGAGAGGAAGA
(14) (14) (13) (13)

rbcL wild type mRNA
AUGUCACCACAAACAGAGACUAAAGCAAGUGUUGGAUUCAAAGCUGGUGU
(13/26) (13/26) (11/26)

rbcL mutant mRNA
AUGaguCCuCAgACAGAaACaAAAGCcucaGUaGGAUUCAAAgcuagc
(10/5) (9/7) (11/3) (9/5)

psbB wild type mRNA
GUAUUUCCAUGGGUUUGCCUUGGUAUCGUGUUCAUACCGUUGUAUUGAAUGAUCCCGG
(17)

psbA wild type mRNA
CAUGACUGCAAUUUUAGAGAGACGCGAAAGCGAAAGCCUAUGGGGUCGCUU
Figure 2A 



WO 00/07431 



4/49 



09/762105 

PCT/US99/17806 



T7g10 mRNA        AUGGCUAGCAUGACUGGUGGACAGCAAAUGGGUCGCGGAUCCGGCUGCUA
Ec ADB            3'-GGGUCAGUACUUAGUGUUUCACCAUU-5' (15)

T7g10+DB/Ec mRNA  AUGGCaAGCAUGACUGGUGGACAGgcuagc
pt ADB            3'-AGGUCAGUGAUCGGGACGGAAGCCGU-5' (13)

T7g10+DB/pt mRNA  AUGGCaAucacuagcccugccuuGgcuagc
pt ADB            3'-AGGUCAGUGAUCGGGACGGAAGCCGU-5' (21)

T7g10-DB mRNA     ACAUAUGgcuagcauugaacaagauggauugcau
pt ADB            3'-AGGUCAGUGAUCGGGACGGAAGCCGU-5' (14)
PrrnLatpB+DBwt (pHK10)

    SacI
  1 gagctcGCTC CCCCGCCGTC GTTCAATGAG AATGGATAAG AGGCTCGTGG
    •
 51 GATTGACGTG AGGGGGCAGG GATGGCTATA TTTCTGGGAG AATTAACCGA
101 TCGACGTGCa AGCGGACATT TATTTTaAAT TCGATAATTT TTGCAAAAAC
151 ATTTCGACAT ATTTATTTAT TTTATTATTA TGAGAATCAA TCCTACTACT
    NheI
201 TCTGGTTCTG GGGTTTCCAC Ggctagc

PrrnLatpB-DB (pHK11)

    SacI
  1 gagctcGCTC CCCCGCCGTC GTTCAATGAG AATGGATAAG AGGCTCGTGG
    •
 51 GATTGACGTG AGGGGGCAGG GATGGCTATA TTTCTGGGAG AATTAACCGA
101 TCGACGTGCa AGCGGACATT TATTTTaAAT TCGATAATTT TTGCAAAAAC
    NheI
151 ATTTCGACAT ATTTATTTAT TTTATTATTA TGAGAgctag c

PrrnLatpB+DBm (pHK50)

    SacI
  1 gagctcGCTC CCCCGCCGTC GTTCAATGAG AATGGATAAG AGGCTCGTGG
    •
 51 GATTGACGTG AGGGGGCAGG GATGGCTATA TTTCTGGGAG AATTAACCGA
101 TCGACGTGCa AGCGGACATT TATTTTaAAT TCGATAATTT TTGCAAAAAC
151 ATTTCGACAT ATTTATTTAT TTTATTATTA TGAGAATaAA cCCgACaACa
    NheI
201 agTGGaagTG GGGTgTCCAC Ggctagc

PrrnLclpP+DBwt (pHK12)

    SacI •
  1 gagctcGCTC CCCCGCCGTC GTTCAATGAG AATGGATAAG AGGCTCGTGG
 51 GATTGACGTG AGGGGGCAGG GATGGCTATA TTTCTGGGAG TTACGTTTCC
101 ACCTCAAAGT GAAATATAGT ATTTAGTTCT TTCTTTCATT TAATGCCTAT
    NheI
151 TGGTGTTCCA AAAGTCCCTT TCCGAAGTCC TGGAGAGGAA gctagc

PrrnLclpP-DB (pHK13)

    SacI •
  1 gagctcGCTC CCCCGCCGTC GTTCAATGAG AATGGATAAG AGGCTCGTGG
 51 GATTGACGTG AGGGGGCAGG GATGGCTATA TTTCTGGGAG TTACGTTTCC
    NheI
101 ACCTCAAAGT GAAATATAGT ATTTAGTTCT TTCTTTCATT TAATGCCTgc
151 tagc
PrrnLrbcL+DBwt (pHK14)

    SacI
  1 gagctcGCTC CCCCGCCGTC GTTCAATGAG AATGGATAAG AGGCTCGTGG
 51 GATTGACGTG AGGGGGCAGG GATGGCTATA TTTCTGGGAG TCGAGTAGAC
101 CTTGTTGTTG TGAaAATTCT TAATTCATGA GTTGTAGGGA GGGATTTATG
    NheI
151 TCACCACAAA CAGAGACTAA AGCAAGTGTT GGATTCAAAg ctagc

PrrnLrbcL-DB (pHK15)

    SacI
  1 gagctcGCTC CCCCGCCGTC GTTCAATGAG AATGGATAAG AGGCTCGTGG
 51 GATTGACGTG AGGGGGCAGG GATGGCTATA TTTCTGGGAG TCGAGTAGAC
101 CTTGTTGTTG TGAaAATTCT TAATTCATGA GTTGTAGGGA GGGATTTATG
    NheI
151 TCAgctagc

PrrnLrbcL+DBm (pHK54)

    SacI
  1 gagctcGCTC CCCCGCCGTC GTTCAATGAG AATGGATAAG AGGCTCGTGG
    •
 51 GATTGACGTG AGGGGGCAGG GATGGCTATA TTTCTGGGAG TCGAGTAGAC
101 CTTGTTGTTG TGAaAATTCT TAATTCATGA GTTGTAGGGA GGGATTTATG
    NheI
151 aguCCuCAgA CAGAaACaAA AGCcucaGTa GGATTCAAAg ctagc

PrrnLpsbB+DBwt (pHK16)

    SacI •
  1 gagctcGCTC CCCCGCCGTC GTTCAATGAG AATGGATAAG AGGCTCGTGG
 51 GATTGACGTG AGGGGGCAGG GATGGCTATA TTTCTGGGAG CAATGCAATA
101 AAGTTACGTA GTGTCTATTT ATCTTTGATA TAAGGGGTAT TTCCATGGGT
    NheI
151 TTGCCTTGGT ATCGTGTTCA TACCGTTGTA TTGAATGATg ctagc

PrrnLpsbB-DB (pHK17)

    SacI •
  1 gagctcGCTC CCCCGCCGTC GTTCAATGAG AATGGATAAG AGGCTCGTGG
 51 GATTGACGTG AGGGGGCAGG GATGGCTATA TTTCTGGGAG CAATGCAATA
    NcoI NheI
101 AAGTTACGTA GTGTCTATTT ATCTTTGATA TAAGGGGTAT TTccatggct
PrrnLpsbA+DBwt (pHK21)

    SacI
  1 gagctcGCTC CCCCGCCGTC GTTCAATGAG AATGGATAAG AGGCTCGTGG
 51 GATTGACGTG AGGGGGCAGG GATGGCTATA TTTCTGGGAA AAAAGCCTTC
101 CATTTTCTAT TTTGATTTGT AGAAAACTAG TGTGCTTGGG AGTCCCTGAT
    NheI
151 GATTAAATAA ACCAAGATTT TACCATGACT GCAATTTTAG AGAGAgctag

PrrnLpsbA-DB (pHK22)

    SacI
  1 gagctcGCTC CCCCGCCGTC GTTCAATGAG AATGGATAAG AGGCTCGTGG
 51 GATTGACGTG AGGGGGCAGG GATGGCTATA TTTCTGGGAA AAAAGCCTTC
101 CATTTTCTAT TTTGATTTGT AGAAAACTAG TGTGCTTGGG AGTCCCTGAT
    NcoI NheI
151 GATTAAATAA ACCAAGATTT TAccatggct agc

PrrnLpsbA-DB(+GC) (pHK23)

    SacI
  1 gagctcGCTC CCCCGCCGTC GTTCAATGAG AATGGATAAG AGGCTCGTGG
 51 GATTGACGTG AGGGGGCAGG GATGGCTATA TTTCTGGGAG CAAAAAGCCT
101 TCCATTTTCT ATTTTGATTT GTAGAAAACT AGTGTGCTTG GGAGTCCCTG
    NcoI NheI
151 ATGATTAAAT AAACCAAGAT TTTAccatgg ctagc
PrrnLT7g10+DB/Ec (pHK18)

    SacI
  1 gagctcGCTC CCCCGCCGTC GTTCAATGAG AATGGATAAG AGGCTCGTGG
    •
 51 GATTGACGTG AGGGGGCAGG GATGGCTATA TTTCTGGGAG GGAGACCACA
101 ACGGTTTCCC aCTAGAAATA ATTTTGTTTA ACTTTAAGAA GGAGATATAC
    NheI
151 ATATGGCaAG CATGACTGGT GGACAGgcta gc

PrrnLT7g10+DB/pt (pHK19)

    SacI
  1 gagctcGCTC CCCCGCCGTC GTTCAATGAG AATGGATAAG AGGCTCGTGG
    •
 51 GATTGACGTG AGGGGGCAGG GATGGCTATA TTTCTGGGAG GGAGACCACA
101 ACGGTTTCCC aCTAGAAATA ATTTTGTTTA ACTTTAAGAA GGAGATATAC
    NheI
151 ATATGGCaAt cactagccct gccttGgcta gc

PrrnLT7g10-DB (pHK20)

    SacI
  1 gagctcGCTC CCCCGCCGTC GTTCAATGAG AATGGATAAG AGGCTCGTGG
    •
 51 GATTGACGTG AGGGGGCAGG GATGGCTATA TTTCTGGGAG GGAGACCACA
101 ACGGTTTCCC aCTAGAAATA ATTTTGTTTA ACTTTAAGAA GGAGATATAC
    NheI
151 ATATGgctag c
SacI

1 gagctcggta cccaaaGCTC CCCCGCCGTC GTTCAATGAG AATGGATAAG 

51 AGGCTCGTGG GATTGACGTG AGGGGGCAGG GATGGCTATA TTTCTGGGAG 

NcoI

101 CGAACTCCGG GCGAATAcGA AGCGCtTGGA TACAGTTGTA GGGAGGGATc
NheI

151 catggctagc ATTGAACAAG ATGGATTGCA CGCAGGTTCT CCGGCCGCTT

201 GGGTGGAGAG GCTATTCGGC TATGACTGGG CACAACAGAC AATCGGCTGC 

251 TCTGATGCCG CCGTGTTCCG GCTGTCAGCG CAGGGGCGCC CGGTTCTTTT 

301 TGTCAAGACC GACCTGTCCG GTGCCCTGAA TGAACTCCAG GACGAGGCAG 

351 CGCGGCTATC GTGGCTGGCC ACGACGGGCG TTCCTTGCGC AGCTGTGCTC 

401 GACGTTGTCA CTGAAGCGGG AAGGGACTGG CTGCTATTGG GCGAAGTGCC 

451 GGGGCAGGAT CTCCTGTCAT CTCACCTTGC TCCTGCCGAG AAAGTATCCA 

501 TCATGGCTGA TGCAATGCGG CGGCTGCATA CGCTTGATCC GGCTACCTGC 

551 CCATTCGACC ACCAAGCGAA ACATCGCATC GAGCGAGCAC GTACTCGGAT 

601 GGAAGCCGGT CTTGTCGATC AGGATGATCT GGACGAAGAG CATCAGGGGC 

651 TCGCGCCAGC CGAACTGTTC GCCAGGCTCA AGGCGCGCAT GCCCGACGGC 

701 GAGGATCTCG TCGTGACACA TGGCGATGCC TGCTTGCCGA ATATCATGGT 

751 GGAAAATGGC CGCTTTTCTG GATTCATCGA CTGTGGCCGG CTGGGTGTGG 

801 CGGACCGCTA TCAGGACATA GCGTTGGCTA CCCGTGATAT TGCTGAAGAG 

851 CTTGGCGGCG AATGGGCTGA CCGCTTCCTC GTGCTTTACG GTATCGCCGC 

901 TCCCGATTCG CAGCGCATCG CCTTCTATCG CCTTCTTGAC GAGTTCTTCT 
XbaI

951 GAacggatct aaaatAGACA TTAGCAGATA AATTAGCAGG AAATAAAGAA

1001 GGATAAGGAG AAAGAACTCA AGTAATTATC CTTCGTTCTC TTAATTGAAT 

1051 TGCAATTAAA CTCGGCCCAA TCTTTTACTA AAAGGATTGA GCCGAATACA 

1101 ACAAAGATTC TATTGCATAT ATTTTGACTA AGTATATACT TACCTAGATA 

HindIII

1151 TACAAGATTT GAAATACAAA ATCTAGcaag ctt 
NcoI
CCATGgcaccacaaacagagAGCCCAGAACGACGCCCGGCCGACATCCGCCGTGCCACCG
         +         +         +         +         +         + 60
GGTACcgtggtgtttgtctcTCGGGTCTTGCTGCGGGCCGGCTGTAGGCGGCACGGTGGC
MAPQTESPERRPADIRRATE

AGGCGGACATGCCGGCGGTCTGCACCATCGTCAACCACTACATCGAGACAAGCACGGTCA
         +         +         +         +         +         + 120
TCCGCCTGTACGGCCGCCAGACGTGGTAGCAGTTGGTGATGTAGCTCTGTTCGTGCCAGT
ADMPAVCTIVNHYIETSTVN

ACTTCCGTACCGAGCCGCAGGAACCGCAGGAGTGGACGGACGACCTCGTCCGTCTGCGGG
         +         +         +         +         +         + 180
TGAAGGCATGGCTCGGCGTCCTTGGCGTCCTCACCTGCCTGCTGGAGCAGGCAGACGCCC
FRTEPQEPQEWTDDLVRLRE

AGCGCTATCCCTGGCTCGTCGCCGAGGTGGACGGCGAGGTCGCCGGCATCGCCTACGCGG
         +         +         +         +         +         + 240
TCGCGATAGGGACCGAGCAGCGGCTCCACCTGCCGCTCCAGCGGCCGTAGCGGATGCGCC
RYPWLVAEVDGEVAGIAYAG

GCCCCTGGAAGGCACGCAACGCCTACGACTGGACGGCCGAGTCGACCGTGTACGTCTCCC
         +         +         +         +         +         + 300
CGGGGACCTTCCGTGCGTTGCGGATGCTGACCTGCCGGCTCAGCTGGCACATGCAGAGGG
PWKARNAYDWTAESTVYVSP

CCCGCCACCAGCGGACGGGACTGGGCTCCACGCTCTACACCCACCTGCTGAAGTCCCTGG
         +         +         +         +         +         + 360
GGGCGGTGGTCGCCTGCCCTGACCCGAGGTGCGAGATGTGGGTGGACGACTTCAGGGACC
RHQRTGLGSTLYTHLLKSLE

AGGCACAGGGCTTCAAGAGCGTGGTCGCTGTCATCGGGCTGCCCAACGACCCGAGCGTGC
         +         +         +         +         +         + 420
TCCGTGTCCCGAAGTTCTCGCACCAGCGACAGTAGCCCGACGGGTTGCTGGGCTCGCACG
AQGFKSVVAVIGLPNDPSVR

GCATGCACGAGGCGCTCGGATATGCCCCCCGCGGCATGCTGCGGGCGGCCGGCTTCAAGC
         +         +         +         +         +         + 480
CGTACGTGCTCCGCGAGCCTATACGGGGGGCGCCGTACGACGCCCGCCGGCCGAAGTTCG
MHEALGYAPRGMLRAAGFKH

ACGGGAACTGGCATGACGTGGGTTTCTGGCAGCTGGACTTCAGCCTGCCGGTACCGCCCC
         +         +         +         +         +         + 540
TGCCCTTGACCGTACTGCACCCAAAGACCGTCGACCTGAAGTCGGACGGCCATGGCGGGG
GNWHDVGFWQLDFSLPVPPR

BglII
GTCCGGTCCTGCCCGTCACCGAGATCTGATGAtcgaattcctgcagcccgggggatccac
         +         +         +         +         +         + 600
CAGGCCAGGACGGGCAGTGGCTCTAGACTACTagcttaaggacgtcgggccccctaggtg
PVLPVTEI*

XbaI
tagttctaga
         + 610
atcaagatct

Figure 19
NcoI NheI

CcATGgctAGCCCAGAAaGAaGaCCGGCCGAtATtaGaCGTGCtACaGAaGCtGAtATGC



ggTACcgaTCGGGTCTTtCTtCtGGCCGGCTaTAatCtGCACGaTGtCTtCGaCTaTACG 
MASPERRPADIRRATEADMP 

CaGCaGTtTGtACaATtGTtAAtCAtTAtATaGAaACAAGtACcGTaAACTTtcGaACtG 



GtCGtCAaACaTGtTAaCAaTTaGTaATaTAtCTtTGTTCaTGgCAtTTGAAagCtTGaC
AVCTIVNHYIETSTVNFRTE 

AaCCtCAaGAACCtCAaGAaTGGACtGAtGAttTaGTCCGTtTaCGaGAGCGCTATCCtT 



TtGGaGTtCTTGGaGTtCTtACCTGaCTaCTaaAtCAGGCAaAtGCtCTCGCGATAGGaA 
PQEPQEWTDDLVRLRERYPW 

GGCTtGTaGCaGAaGTtGACGGaGAaGTaGCtGGgATtGCaTAtGCGGGCCCgTGGAAaG 



CCGAaCAtCGtCTtCAaCTGCCtCTtCAtCGaCCcTAaCGtATaCGCCCGGGcACCTTtC 
LVAEVDGEVAGIAYAGPWKA

CAcGaAAtGCaTAtGAtTGGACgGCtGAaTCaACtGTgTACGTtTCaCCaCGtCAtCAaC 

GTgCtTTaCGtATaCTaACCTGcCGaCTtAGtTGaCAcATGCAaAGtGGtGCaGTaGTtG 
RNAYDWTAESTVYVSPRHQR 

GgACaGGACTtGGtTCtACttTaTAtACcCAtCTaCTGAAaTCttTGGAGGCACAgGGtT 



CcTGtCCTGAaCCaAGaTGaaAtATaTGgGTaGAtGACTTtAGaaACCTCCGTGTcCCaA 
TGLGSTLYTHLLKSLEAQGF

TtAAGAGtGTgGTaGCTGTtATaGGatTGCCgAAtGAtCCctcgGTaCGCATGCAcGAaG 



AaTTCTCaCAcCAtCGACAaTAtCCtaACGGcTTaCTaGGgagcCAtGCGTACGTgCTtC 
KSVVAVIGLPNDPSVRMHEA 

CtCTcGGATATGCtCCcaGaGGtATGtTGaGGGCcGCaGGtTTCAAaCAtGGaAAtTGGC 



GaGAgCCTATACGaGGgtCtCCaTACaACtCCCGgCGtCCaAAGTTtGTaCCtTTaACCG 
LGYAPRGMLRAAGFKHGNWH 

ATGAtGTaGGTTTtTGGCAaCTtGAcTTCtcttTaCCaGTACCtCCtCGTCCcGTttTaC 



TACTaCAtCCAAAaACCGTtGAaCTgAAGagaaAtGGtCATGGaGGaGCAGGgCAaaAtG 
DVGFWQLDFSLPVPPRPVLP 

BglII XbaI

CcGTtACtG AGATCT GATGA tctaga 



GgCAaTGaCTCTAGACTACTagatcc 
V T E I * * 
NcoI NheI

ccaTGqcrAGC CCAGAfl.aGAaGaCCGGCCGAtArggtGaCGTGCtJlCaGAaGCtGAtATGC 



ggTACcgaTCGGGTCTTtCT-C^GGCCGGCTaTAatCtGCACGaTGrCTtCGaCTaTACG 
MASPERRPADIRRATEADMP



CaGCaGTtTGtACaATtGTtAAtCAtTAtATaGAaACZiAGtACaGTaAAtTTtcGaACtG 

, t- -f. > , * . J. + + 

GtCGtCAaACaTGtTAaCAaTTaGTaATaTAtCTtTGTTCaTGtCAtTTaAAfegC-tTGaC 
AVCTIVNHYIETSTVNFRTE 

AaCCtCAaGAACCtCAaGAaTGGACtGAcGAtrTaGTaCGTtTaCGaGAaCGtTATCCfcT 

+ + +-— + + — - — ~ -t- 

TtGGaGTtCTTGGaGTtCTtACCTGaCTaCTaaAtCAtGGAaAtGCtCTtGCaATAGGaA 
PQEPQEWTDDLVRLRERYPW

GGCTtGTaGCaGAaGTtGAcGGaGAaGTaGCtGGaATtGCaTAtGCtGGtCCgTGGAAaG 
— , + + + — + + 

CCGAaCAtCGtCTtCAaCTgCCtCTtCAtCGaCCtTAaCGtATaCGaCCaGGcACCTTtC 
LVAEVDGEVAGIAYAGPWKA

CAcGaAAtGCaTAtGAtTGGACaGCtGAaTCaACtGTtTAtGTtTCaCCaCGcCAtCAaC 
+ ' + __ __, — j. + -i + 

GTgCtTTaCGtATaCTaACGTGtCGaCTtAGtTGaCAaATaCAaAGtGGtGCaGraGTcG 
RNAYDWTAESTVYVSPRHQR

GtACaGGACTtGGtTCtACttTaTAtACtCAtCTtCTtAAaTC-CtTGGAaGCACAaGGtT 

— — — — r ■¥ + >4— 4- 

CaTGtCCTGAaCCaAGaTGaaAtATaTGaGTaGAaGAaTTtAGaaACCTtCGTGTtCCaA 
TGLGSTLYTHLLKSLEAQGF 

TtAAaAGtGTaGTaGCTGTtATaGGatTGCCgAAtGAtCCctcaGTaCGCATGCArGAaG 
— ->- -f n > + «. + -u 

AaTTtTCaCAtCAtCGACAaTAtCCtaACGGcTTaCTaGGgagtCAtGCGTACGTaCTtC 
KSVVAVIGLPNDPSVRMHEA
GaGAaCCTATACGaGGg-cCcCCaTACaACtCCCGtCGtCCaAAGTTtGTaCCtTTaACCG

ATGAtGTaGGTTTtTGGCAaCTtGAeTTCtCCtTaCCaGTACCtCCtCGTCCcGTfctTaC
FLARE16-S.seq Length: 1574



tfGgGGgc 

51 JGAGGTAGTTG 

101 ACATTTGTAC 

151 TTGATTTGCT 

201 GCTTTGATCA 

251 GATTCTCCGC 

301 CGTGGCGTTA 

351 AATGACATTC 

401 GGCTATCTTG 

451 CAGCGGCGGA 

501 5CGCTAAATG 

551 CGATGAGCGA 

€01 rAACCGGCAA 

651 2GCCTGCCGG 

701 rCTTGGACAA 

751 feATTTGTCCA 



€01 ,ctt?gttgaag 

851 ItAGTAAAGGA 

901 rAGATGGTGA 

951 SGTGATGCAA 

1001 &AAACTACCT 

1051 rTCAATGCTT 

1101 &AGAGCGCCA 

1151 3GACGACGGG 

1201 XCTCGTCAA 

1251 5ACATCCTCG 

1301 ^ATCACGGCA 

1351 3ACACAACAT 

1401 ^ATACTCCAA 

1451 3TCCACACAA 

1501 PGGTCCTTCT 



tagcGAAGCG 
GCGTCATCGA 
GGCTCCSCAG 
GGTTACGGTG 
ACGACCTTTT 
GCTGTAGAAG 
TCCAGCTAAG 
TTGCAGGTAT 
CTGACAAAAG 
GGAACTCTTT 
AAACCTTAAC 
AATGTAGTGC 
AATCGCGCCG 
CCCAGTATCA 
GAAGAAGATC 
CTACGTGAAA 



GTGATCGCCG 
GCGCCATCTC 
TGGATGGCGG 
ACCGTAACsC 
GGAAACTTCG 
TCACCATTGT 
CGCGAACTGC 
CTTCGAGCCA 
CAAGAGAACA 
GATCCGGTTC 
GCTATGGAAC 
TTACGTTGTC 
AAGGATGTCG 
GCCCGTCATA 
GCTTGGCCTC 
GGCGAGATCA 



AAGTATCGAC 
GAACCGACGT 
CCTGAAGCCA 
TTGATGAAAC 
GCi'TCCCCTG 
TGTGCACGAC 
AATXTGGAGA 
GCCACGATCG 
tAGCGTTGCC 
CTGAACAGGA 
TCGCCGCCCG 
CCGCATTTGG 
CTGCCGACTG 
CTTGAAGCTA 
GCGCGCAGAT 
CGAAGGTAGT 



TCAACTATCA 

TGCTGGCCGT 

CACAGTGATA 

AACGCGGCGA 

GAGAGAGCGA 

GACATCATTC 

ATGGCAGCGC 

ACATTGATCT 

TTGGTAGGTC 

TCTATTTGAG 

ACTGGGCTGG 

TACAGCGCAG 

GGCAATGGAG 

GACAGGCTTA 

CAGTTG GAAG . 

qGGCAAfibaa 



gaaaattgga gctagtagaa ggtctrtaaag tcgctff VTGgc 
GAAGAA'ctTT "TCACTGGAST'TGl'dddAAYT OTTGTTGAAT 



TGTTAATGGG CACAAATTTT 
CATACGGAAA ACTTACCCTT 
GTTCCtTGGC CAACACTTGT 
WCAAGATAC CCAGATCATA 
TGCCTGAGGS ATACGTGCAG 
AACTACAAGA CACGTGCTGA 
CAGGATCGAG CTTAAGGGAA 
GCCACAAGTT GGAATACAAC 
GACAAACAAA AGAATGGAAT 
.TGAAGATGGA AGCGTTCAAC " 
TTGGCGATGG CCCTGTCCTT 
TCTGCCCTTT CGAAAGATCC 
TGAGT TTGTA ACAGCTGCTG 



15S1 5ARCTATACA AATAAG oe te tara 



CTGTCAGTGG 
AAATTTATTT 
CACTACTTTC 
TGAAGCGGCA 
GAGAGGACCA 
AGTCAAGTTT 
TCGATTTCAA 
TACAACTCCC 
CAAAGCTAAC 
TAGfcAGACCA 
TTACCAGACA 
CAACGAAAAG 
GGATTACACA 



AGAGGGTGAA 
GCACTACTGG 
TCTTATGGTG 
CGACTTCTTC 
TCTCTTTCAA 
GAGGGAGACA 
GGAGGACGGA 
ACAACGTATA 
TTCAAAATTA 
TTATCAACAA 
ACCATTACCT 
AGAGACCACA 
TGGCATGGAT 



Figure 28 
,seq Length:

S<uX t— 
gagCtcGCTC 
GATTGACGTG 
CTTGTTGTTG 



ftGCGCCATCT 
GTGGATGGCG 
GACCGTAAGG 
TGGAAACTTC 
GTCACCATTG 
GCGCGAACTG 
TCTTCGAGCC 
2CAAGAGAAC 
TGATCCGGTT 
2GCTATGGAA 
2TTACGTTGT 
SAAGGATGTC 
kGCCCGTCAT 

:gcttggcct 
^ggcgagatc 



19S3 

CCCCGCCGTC 
AGGGGGCAGG 
TGAaAATTCT 
CftGflSftgTAA 



GAAGTATCGA 
CGAACCGACG 
GCCTGAAGCC 
CTTGATGAAA 
GGCTTCCCCT 
TTGTGCACGA 
CAATTTGGAG 
AGCCACGATC 
ATAGCGTTGC 
CCTGAACAGG 
CTCGCCGCCC 
CCCGCATTTG 
GCTGCCGACT 
ACTTGAAGCT 
CGCGCGCAGA 
ACCAAGGTAG 



GTTCAATGAG 
GATGGCTATA 
TAATTCATGA 
aSCaftfiTSTX. 



CTCAACTATC 
TTGCTGGCCG 
ACACAGTGAT 
CAACGCGGCG 
GGAGAGAGCG 
CGACATCATT 
AATGGCAGCG 
GAGATTGATC 
CTTGGTAGGT 
ArCTATTTGA 
GACTGGGCTG 
GTACAGCGCA 
GGGCAATGGA 
AGACAGGCTT 



AATGGATAAG 
TTTCTGGGAG 
GTTGTA< 
GGATTi 



AGGCTCGTGG 
TCGAGTAGAC 
GGGATTTATG 



AGAGGTAGTT 
TACATTTGTA 
ATTGATTTGC 
AGCTTTGATC 
AGATTCTCCG 
CCGTGGCGTT 
CAATGACATT 
TGGCTATCTT 
CGAGCGGCGG 
GGCGCTAAAT 
GCGATGAGCG 
GTAACCGGCA 
GCGCCTGCCG 
ATCTTGGACA 
GA^^GTCC, 



ctagcE AAGC 



agctagtaga aggtct xaaa gtcgccfSTEg 



XTCACTGGAG 



GCACAAATTT 
AACTTACCCT 
CCAACACTTG 
CCCAGATCAT 
GATACGTGCA 
ACACGTGCTG 
SCTTAAGGGA 
rGGAATACAA 
5AGAATGGAA 
&AGCGTTCAA 
3CCCTGTCCT 
rCGAAAGATC 



TTGTCCCAAT 
TCTGTCAGTG 
TAAATTTATT 
TCACTACTTT 
ATGAAGCGGC 
GGAGAGGACC 
AAGTCAAGTT 
ATCGATTTCA 
CTACAACTCC 
TCAAAGCTAA 
CTAGCAGACC 
TTTACCAGAC 
CCAACGAAAA 



GAGAGGGTGA 
TGCACTACTG 
CTCTTATGGT 
ACGACTTCTT 
ATCTCTTTCA 
TGAGGGAGAO 
AGGAGGACGG 
CACAACGTAT 
CTTCAAAATT 
ATTATCAACA 
AACCATTACC 
GAGAGACCAC 



acttgttgaa 
ctAG ' i'AAAGU " 
TTAGATGGTG 
AGGTGATGCA 
GAAAACTACC 
GTTCAATGCT 
CAAGAGCGCC 
AGGAGGACGG 
ACCCTCGTCA 
AAACATCCTC 
ACATCACGGC 
AGACACAACA 
AAATACTCCA 
TGTCCACACA 
ATGGTCCTTC 



GGCGTCATCG 
CGGCTCCGCA 
TGGTTACGGT 
AACGACCTTT 
CGCTGTAGAA 
ATCCAGCTAA 
CTTGCAGGTA 
GCTGACAAAA 
AGGAACTCTT 
GAAACCTTAA 
AAATGTAGTG 
AAATCGCGCC 
GCCCAGTATC 
AGAAGAAGAT 
ACTACGTGAA. 



ggaaaattgg 
AGAAGAAC' JT' 
ATGTTAAT.GG 
ACATACGGAA 
TGTTCCtTGG 
TTTCAAGATA 
ATGCCTGAGG 
GAACTACAAG 
ACAGGATCGA 
GGCCACAAGT 
AGACAAACAA 
TTGAAGATGG 
ATTGGCGATG 
ATCTGCCCTT. 



^ACAGCTGCT. GGGATTACAC ATGGCATGGA TGAACTATAC AARTAAGqct 



ctagagcjg ST 
&TAATCATTT 
ITCTTTTTAT 
CATTATAGAA 



TCTTGTTCTA 
TTATTTACTA 
AAAGAAGGAG 



'CTATAGGAG 
TCAAGAGGGT 
GTATTTTACT 
AGGTTATTTT 



5TTTTGAAM ' 
GCTATTGCTC 
TACATAGACT 
CTTGCATTTA 



AAATAAGgct 
iiAAASGAUCAj 
CTTTCTTTTTj ^ 



CTTTCTTTTTj ^ 
TTTTT GTTTAl ^ 
TTCATG> aag N 
IfTWlB. 



Figure 29 
FLARE16-S2.seq Length: 1985



GATTGACGTG 
TCGACGTGCa 
ATTTCGACAT 



CCCCGCCGTC 
AGGGGGCAGG 
AGCGGACATT 
ATTTATTTAT 
SSgrTECCSfi. 



GTTCAATGAG AATGGATAAG 
GATGGCTATA TTTCTGGGAG 
TATTTTaAAT TCGATAATTT 
TTTATTATTA TGAGAATCAA 



GAC'TCAACTA 
CGTTGCTGGC 
CCACACAGTG 
AACAACGCGG 
CTGGAGAGAG 
SACGACATCA 
&GAATGGCAG 
TCGACATTGA 
SCCTTGGTAG 
SGATCTATTT 
"CGACTGGGC 
TGSTACAGCG 
CTGGGCAATG 
CTAGACAGGC 
SATCAGTTGG 

jV GJaGGCAAa gaacttgttg 
aagtcgccft T GgcrAGTAAA 

i._„ '„„ AATTAGATGG 

GAAGGTGATG 
TGGAAAACTA 
GTGTTCAATG 
TTCAAGAGCG 
.CAAGGACGAC 
ACACCCTCGT 
GGAAACATCC 
AIACATCACG 
TTAGACACAA 
CAAAATACTC 
CCTGTCCACA 
ACATGGTCCT 



TCAGAGGTAG 
CGTACATTTG" 
ATATTGATTT 
CGAGCTTTGA 
CGAGATTCTC 
TTCCGTGGCG 
CGCAATGACA 
TCTGGCTATC 
GTCCAGCGGC 
GAGGCGCTAA 
TGGCGATGAG 
CAGTAACCGG 
GAGCGCCTGC 
TTATCl' x GSA 
AAGAATTTGT 



TACGGCTCCG 
GCTGGTTACG 
TCAACGACCT 
CGCGCTGTAG 
TTATCCAGCT 
TTCTTGCAGG 
TTGCTGACAA 
GGAGGAACTC 
ATGAAACCTT 
CGAAATGTAG 
CAAAATCGCG 
CGGCCCAGTA 
CAAGAAGAAG 
CCACTACGTG 



CGAGCGCCAT 
CAGTGGATGG 
GTGACCGTAA 
TTTGGAAACT 
AAGTCACCAT 
AAGCGCGAAC 
TATCTTCGAG 
AAGCAAGAGA 
TTTGATCCGG 
AACGCTATGG 
TGCTTACGTT 
CCGAAGGATG 
TCAGCCCGTC 
ATCGCTTGGC 
AAAGGCGAGA 



AGGCTCGTGG 
AATTAACCGA 
TTGCAAAAAC 
TCCTACTACT ) . 

:GAA@i^H ^ 
CTCGAACCGA 
CGGCCTGAAG 
GGCTTGATGA 
TCGGCTTCCC 
TGTTGTGCAC 
TGCAATTTGG 
CCAGCCACGA 
ACATAGCGTT 
TTCCTGAACA , 
AACTCGCCGC ; 
GTCCCGCATT 
TCGCTGCCGA 
ATACTTGAAG 
CTCGCGCGCA 
TCACCAAGGT, 



aaggaaaatt ggagctagta gaaggtct-ta 
GGAGAAGAAC 



&.TTCTTGTTG 
TGGAGAGGGT 
rTTGCACTAC 
rTCTCTTATG 
3CACGACTTC 
^CATCTCTTT 
ITTGAGGGAG 
IAAGGAGGAC 
XCACAACGT 
&ACTTCAAAA 
-CATTATCAA 
iCAACCATTA. 
QAGAGAGACC 



TATCAAGAGG 
I&GIATTTTA 
afiftSSTTflTT 



TGATGTTAAT 
CAACATACGG 
CCTGTTCCtT 
CTTTTCAAGA 
CCATGCCTGA 
GGGAACTACA ; 
CAACAGGATC 
TCGGCCACAA 
GCAGACAAAC 
CATTGAAGAT 
CAATTGGCGA 
CAATCTGCCC 
TCTTGAG! 
£CAAA£ 



TTTTCACTGG 
GGGCACAAA7 
AAAACTTACC 
GGCCAACACT 
TACCCAGATC 
GGGATACGTG 
AGSCACGTGC 
GAGCTTAAGG 
GTTGGAATAC 
AAAAGAATGG 
GGAAGCGTTC 
TGGCCCTGTC 
TTTCGAAAGA 



AGTTGTCCCA" 
TTTCTGT CAG 
CTTAAATTTA 
TGTCACTACT 
ATATGAAGCG 
CAGGAGAGGA 
TGAAGTCAAG 
GAATCGATTT 
AACTACAACT 
AATCAAAGCT 
AACTAGCAGA 
CTTTTACCAG 
TCCCAACGAA 



AAGAAAGGAG 



AGGWttGAA J 

GXGCTATTGC TCCTTTCTTT TTTTCTTTTT J 
CTTACATAGA i 
TTCTTGCATT 1 




TATTCATGfra agctt 



Figure 30 
FLARE11-S.seq Length: 1595



ccatgggggc 



51 SAAGCGGTGA 

101 CaTCGAGCGC 

151 CCGCAGTGGA 

201 &CGGTGACCG 

251 bcTTTTGGAA 

301 [tagaagtcac 

351 3ctaagcgcg 

401 aggtatcttc 

451 ^AAAAGCJiAG 

501 2TCTTTGATC 

551 CTTAACGCTA 

601 TAGTGCTTAC 

€51 SCGCCGAAGG 

701 GTATCAGCCC 

751 &AGATCGCTT 

*eoi |gtcaaaggcg 

851 aaaattqqaq 



tagcgaacaa 



aaactcartt 



TCGCCGAAGT 
CATCTCGAAC 
"TGGCGGCCTG 
TAAGGCTTGA 
ACTTCGGCTT 
CATTGTTGTG 
AACTGCAATT 
GAGCCAGCCA 
AGAACATAGC 
CGGTTCCTGA 
TGGAACTCGC 
GTTGTCCCGC 
ATGTCGCTGC 
GTCATACTTG 
GGCCTCGCGC 
AGATCRCCAA 



901 rTGTCCCAAT 

951 rCTGTCAGTG 

1001 TAAATTTATT 

1051 rCACTACTTT 

1101 ATGAAGCGGC 

1151 SGAGAGGACC 

1201 &AGTCAAGTT 

1251 &TCGATTTCA 

1301 STACAACTCC 

1351 rCAAAGCTAA 

1401 ^TAGCAGACC 

1451 rTTACCAGAC 

1501 JCAACGAAAA 

1551 bGGATTACAC, 



gtcgccgTSg 



ATCGACTCAA 
CGACGTTGCT 
AAGCCACACA 
TGAAACAACG 
CCCCTGGAGA 
CACGACGACA 
TGGAGAATGG 
CGATCGACAT 
GTTGCCTTGG 
ACAGGATCTA 
CGCCCGACTG 
ATTTGGTACA 
CGACTGGGCA 
AAGCTAGACA 
GCAGATCAGT 
GGTAGTqSSC 



TCTTGTTGAA 
GAGAGGGTGA 
TGCACTACTG 
CTCTTATGGT 
ACGACTTCTT 
ATCTCTTTCA 
TGAGGGAGAC 
AGGAGGACGG 
CACAACGTAT 
CTTCAAAATT 
ATTATCAACA 
AACCATTACC 
GAGAGACCAC 
ATGGCATGGA 



~ctAGTAAAGG 
TTAGATGGTG 
AGGTGATGCA 
GAAAACTACC 
GTTCAATGCT 
CAAGAGCGCC 
AGGACGACGG 
ACCCTCGTCA 
AAACATCCTC 
ACATCACGGC 
AGACACAACA' 
AAATACTCCA 
TGTCCACACA 
ATGGTCCTTC 
TG&&CTATAC 



ctgaagaaga cttgcctagc 
CTATCAGAGG TAGTTGGCGT 
GGCCGTACAT TTGTACGGCT 
GTGATATTGA TTTGCTGGTT 
CGGCGAGCTT TGATCAACGA 
GAGCGAGATT CTCCGCGCTG 
TCATTCCGTG GCGTTATCCA 
CAGCGCAATG ACATTCTTGC 
TGATCTGGCT ATCTTGCTGA 
TAGGTCCAGC GGCGGAGGAft 
TTTGAGGCGC TAAATGAAAG 
GGCTGGCGAT GAGCGAAATG 
GCGCAGTAAC CGGCAAAATC 
ATGGAGCGCC TGCCGGCCCA 
GGCTTATCTT GGACAAGAAG 

AAA^aacttg cagttgaagg 
AGAAGAACTT TTCACTGGAG 
ATGTTAATGG GCACAAATTT 
ACATACGGAA AACTTACCCT 
TGTTCCtTGG CCAACACTTG 
TTTCAAGATA CCCAGATCAT 
ATGCCTGAGG GATACGTGCA 
GAACTACAAG ACACGTGCTG 
ACAGGATCGA GCTTAAGGGA 
GGCCACAAGT TGGAATACAA 
AGACAAACAA AAGAATGGAA 
TTGAAGATGG AAGCGTTCAA 
ATTGGCGATG GCCCTGTCCT, 
ATCTGCCCTT TCGAAAGATC 
TTGAGT TTGT AACAGCTGCT 



1 



AAATAAfea ct ctaga 

xur 



Figure 31 
FLARE11-S3.seq Length: 1961



CCGTACATTT 
GATATTGATT 
GCGAGCTTTG 
GCGAGATTCT 
ATTCCGTGGC 
GCGCAATGAC 
ATCTGGCTAT 
SGTCCAGCGG 
TGAGGCGCTA 
CTGGCGATGA 
SCAGTAACCG 
3GAGCGCCTG 
CTTATCTTGG 
SAAGAATTTG 



SLgAac: 




GTACGGCTCC 
TGCTGGTTAC 
ATCAACGACC 
CCGCGCTGTA 
GTTATCCAGC 

AT m CTTGCAG 

CTTGCTGACA 
CGGAGGAACT 
AATGAAACCT 
GCGAAATGTA 
GCAAAATCGC 
CCGGCCCAGT 
ACAAGAAGAA 
TCCACTACGT 



TBS C BH&fg 

TCGAGCGCCA 
GCAGTGGATG 
GGTGACCGTA 
TTTTGGAAAC 
GAAGTCACCA 
TAAGCGCGAA 
GTATCTTCGA 
AAAGCAAGAG 
CTTTGATCCG 
TAACGCTATG 
GTGCTTACGT 
GCCGAAGGAT 
ATCAGCGCGT 
GATCGCTTGG 
GAAAGGCGAG 



gcgaacaaaa 
flCCGAAGTAi' 
TCTCGAACCG 
GCGGCCTGAA 
AGGCTTGATG 
TTCG3CTTCC 
TTGTTGTGCA 
CTGCAATTTG 
GCCAGCCACG 
AACATAGCGT 
GTTCCTGAAC 
GAACTCGCCG 
TGTCCCGCAT 
GTCGCTGCCG 
CATACTTGAA 
CCTCGCGCGC 
ATCACj 



actcatttct 
"CGACTCAACr 



raacttgca gttgaaggaa aattggaggt 



ACGTTGCTGG 
GCCACACAGT 
AAACAACGCG 
CCTGGAGAGA 
CGACGACATC 
GAGAATGGCA 
ATCGACATTG 
TGCCTTGGTA 
AGGATCTATT 
CCCGACTGGG 
TTGGTACAGC 
ACTGGGCAAT 
GCTAGACAGG 
AGATCAGTTG 

ic&AGG TAgiaGgcaa 



GTTAATGGGC 
ATACGGAAAA 
TTCCtTGGCC 
TCAAGATACC 
GCCTGAGGGA 
ACTACAAGAC 
AGGATCGAGC 
CCACAAGTTG 
&CAAACAAAA 
5AAGA7GGAA 
FGGCGATGGC 
3TGCCCTTTC. 
3AGT TTGTAA 
ITAAGftctct 



CACTGGAGTT 
ACAAATTTTC 
CTTACCCTTA 
AACACTTGTC 
CAGATCATAT 
TACGTGCAGG 
ACGTGCTGAA 
TTAAGGGAAT 
GAATACAACT 
GAATGGAATC 
GCGTTCRACT 
CCTGTCCTTT 
GAAAGATCCC 
CAGCTGCTGG 



AATSSESSSaT 
ttcttttttt 
rjT GTTTACA 
CATGa aaqct 



agagcfr alCC 

aatcattttc 

CTTTTTATTT 
TTATAGAAAA 



GTCCCAATTC 
TGTCAGTGSA 
AATTTATTTG 
ACTACTTTCT 
GAAGCGGCAC 
AGAGGACCAT 
GTCAAGTTTG
CGATTTCAAG 
ACAACTCCCA 
AAAGCTAACT 
AGCAGACCAT 
TACCAGACAA 
AACGAAAAGA 
GATTACACAT 



cgcc ATGgct: 
TTG'fTGAATT 
GAGGGTGAAG 
CACTACTGGA 
CTXATGGTGT- 
GACTTCTTCA 
CTCXTTCAAG 
AGGGAGAGAC 



CAACGTATAC 
TCAAAATTAG 
TATCAACAAA 
CCATTACCTG 
GAGACCACAT 
GGCATGGATG 



AGTAAAGGAG 
AGATGGTGAT 
GTGAiGCAAC 
ASACTACCTG 
TCAAtGCTTT 
AGAGCGCCAT 
GACGACGGGA 
CCTCGTCAAC 
ACATCCTCGG 
ATCACGGCAG 
ACACAACATT 
ATACTCCAAT 
TCCACACAAT 
GGTCCTTCTT 
AACTATACAA 



TGGCCTAGI'C 
TTGTTCTATC 

ATTTACTAGT 
AGAAGGAGAG 



TATAGGAGGT 
AAGAGGGTGC 
ATTTTACTTA 
GTTATTTTCT 



Tl'TCaAAAAGA 
TATTGCTCCT 
CATAGACTTT 
TGCATTTATP 



Figure 32 
GGGAACGGAT

GCTTCATGCA 
TGGAGTTAGC 
AGCACGTGTG 
TCTCCTTCCT 
TAGTGGCAAC 
ACCTTACGGC
CCGAGGGCAC
TAAGGTTCTT 
CGGGCCCCCG 
GGCGGGATAC 
AGCACCTAGT 
CATTTGCTCC 
GTGCTTTCGC 
ACCGGAAATT 
GCCTGTCCAG 
CTACAGACGC
CTTACCGCGG 
CGTCATTGTT 
ACCTCCACGC 
CCCACTGCTG 
GCTGATCATC 
GCCTCACCAA 
CCTTTTGCTC 
TCCCCTCCCA 
AAACACCACT 
CATCCTGAGC 
ATAGCTTCCT 
AAGGATAACT 
CCAGCAGTAT 
ATCCATTCCC 
CTCACATTGG 
GTCAAGGTGA 
GAAAGAGAAT
CCCTTCCTCC 
GAAAAATTGG 
ATGGATTCGA 



^caccgcCgt 
ggcg&gttgc 
tcaccctcgc 
tcgcccaggg 
ccggcttaac 
taaacacgag 
acgagctgac 

CCCTCTCTTT 
CGCTTTGCAT
TCAATTCCTT 
TTAACGCGTT 
ATCCATCGTT 
CCTAGCTTTC 
CGTTGGTGTT 
CCCTCTGCCC 
GGTTGAGCCC 
TTTACGCCCA 
CTGCTGGCAC 
TCTTCTCCGA 
GGCATTGCTC
CCTCCCGTAG 
CTCTCGGACC 
CTAGCTAATC 
CTCAGCCTAC 
AGGGCASSTT 

CAGGATCGAA 
TATTCGTAGA 
TGTATCCATG 
AGCCAACCCT 
GTTCGATCGT 
GTTTAGGGAT
CACTCTACCG
TACCGAATCC
GGGCTTTCTT
ATTCAATTGT 
GCCATAGCAC 



STGGdtGAtS" 
AGCCTGCAAT 
GAGATCGCGA 
CATAAGGGGC 
ACCGGCGGTC 
GGTTGCGCTC 
GACAGCCATG
CAAGAGGATT 
CGAATTAAAC 
TGAGTTTCAT 
AGCTACAGCA 
TACGGCTAGG 
GTCTCTCAGT 
CTTTCCGATC 
CTACCGTACT 
TGGGATTTGA 
ATCATTCCGG 
AGAGTTAGCC 
GAAAAGAAGT
CGTCAGGCTT
GAGTCTGGGC
AGCTACTGAT
AGACGCGAGC 
GGGGTATTAG 
CTTACGCGTT 
CTTGCATGTG 
CTCTCCATGA
CAAAGCGGAT
CGCTTCAGAT 
ACCCTATCAC 
GGCGGGGGGA 
AATCAGGCTC 
CTGAGTTATA 
TAAGGCAAAG 
TCCACACTAT 
CAACCGGTCC 
ATGGTTTCAT 



GGT^SmST 
CCGAACTGAG 



fflTCTWATC GAffCfiffffTTT CCftTSAflS 



ATGATGACTT 
TGTTCAGGGT 
GTTGCGAGAC 
CACCACCTGT 
CGCGGCATGT 
CACATGCTCC 
TCTTGCGAAC 
CTGCACGGGT
ACTACTGGGG 
GTCAGTGTCG 
TCAATGCATT 
CCAGCTTGGT 
CGGCGGACTT 
ATAACGCTTG 
GATGCTTATT 
TGACGACCCG 
TCGCCCATTG 
CGTGTCTCAG 
CATCGCCTTG 
CCCTCCTTGG 
CAACCGTTTC 
ACTCACCCGT 
TTAAGCATGC 
GATTCATAGT
TCGGAATTGT
TATTAGCCTG 
GTCAATCCCA 
GTAAGTCAAA 
GAACTGATGA 
TCCCTTCCGC 
GGGCGAGAAA 
TATGGATAGT 
TATCGAAAAT 
lTCTGTA 
gafccgacggt 



ttgcatgcct gcaggt(y GAA. TATAGC'i'CT!!' Cl"i"fC'I".l!A'l"r 

ATATTCAAAG ATAAGAGATA 

GAAAAAAAAA ATCAAAAAGA 
TTCTATTTCA CAATTTAAAC 
ATGAATAAAT GCAAGAAAAT 
AAAAAAGTCT ATGTAAGTAA 
GAAAGGAGCA ATAGCACCCT 



TATTATTTCA 
AAAATTTGAT 
TTAGCAAGAA 
TCAAAATAGA 
TCTTXTTCTA 
AATAAATAAA 
CAAGAAAATS. 
CCAGGATOgc 



AAGATAAGAG
TTTTTTTTTG
GAGAAACAAG
ATACTCAATC
TAATGTAAAC
AAGAAAAAAA



RTT^TTC^Tf ^^ jj"f* TC 1 ***^" T " T ' < 7 A^ftACCTQt^ ' y 



CCTitCACGT 
GCGATCTTCT 
GCTGATACTG 
TTCGGCGCGA 
AAGCACTACA 
GCGTTAAGGT 
TCAAAGAGTT
y XGCTTTTGTC 



tetag< 



iACAAA 



AGTGGA* 
TCTTGTCCAA 
GGCCGGCAGG 
TTTTGCCGGT 
TTTCGCTCAT 
TTCATTTAGC 
CCTCCGCCGC 
AGCAAGATAG 



CMfATTTfiC 

TTCTTCCAAC 
GATAAGCCTG 
CGCTCCATTG 
TACTGCGCTG 
CGCCAGCCCA 
GCCTCAAATA 
TGGACCTACC 
CCAGATCAAT 



CGACTACC T' J 

TGATCTGCGC 
TCTAGCTTCA 
CCCAGTCGGC 
TACCAAATGC 
GTCGGGCGGC 
GATCCTGTTC 
AAGGCAACGG 
GXCGATCGTG 



'AGCGAflfCcT 
GACGGGTTTT 
CGCCCATTGT 
GGCCTCATCC 
TCCAAACTCA 
TTAACCCAAC 
GTCCGCGTTC 
CAAGCCCTGG 
ACCGCTTGTG 
GTACTCCCCA 
CGAGTCGCAC 
TCTCTAATCC 
GCCCAGCAGA 
TCACCGCTCC 
AGTTTCCACC 
GAAAAGCCAC 
CATCCTCTGT 
CCTCAGATAC 
TGGGCCTTCC 
CGGAAAATTC 
TCCCAGTGTG 
GTAAGCTATT 
GCGGATTTCT 
CAGTTGTTGT 
TCGCCACTGG 
CGCCAGCGTT 
TGCATTACTT 
CTTTCCTTCC
GAGTTCGCCA 
CAAGCCTCTT 
ATAGAAAAAA 
CTTCCACCAC 
GTCCCCTCGA 
CTCAAGGCCA 
CAAATAATGG 
AGGATTGACT 
CGATTTTCCC 
atcgataagc 
TCAA T GATAT 
AGAAGAAGTC 
TATAGTAACA 
AAATACAAAA 
AACCTCTCCT 
AATACTAGTA 
CTTGATAGAA 
■&TA<5ftCyAGG 






5STGAWCS 
GCGAGGCCAA 
AGTATGACGG 
AGCGACATCC 
GGGACAACGT 
GAGTXCCATA 
AGGAACCGGA 
TATGTTCTCT 
GCTGGCTCGAj^ 
AGATACCTGC
CGCTTAGCTG 
GACTTCTAC& 
CCAAAAGGTC 
GTCACCGTAA 
CACTGCGGAG 
GCTCGATGAC 
RCCC3CTTCCC 
TTCGCCCGGA 



AAGAATGTCA 
GATAACGCCA 
GCGCGGAGAA 
GTTGATCAAA 
CCAGCAAATC 
CCGTACAAAT 
GCCAACTACC 
TCATGgATCC 
GTTCGCTCCC 



TTGCGCTGCC 
CGGAATGATG 
TCTCGCTCTC 
GCTCGCCGCG 
AATATCACTG 
GTACGGCCAG 
TCTGATAGTT 
CTCCCTACAA 
AGAAATATAG 



&TTCTCCAAA 
TCGTCGTGCA 
TCCAGGGGAA 
TTGTTTCATC 
TGTGGCTTCA 
CAACGTCGGT 
GAGTCGATAC 
CTGTATCCAa 
CCATCCCTGC 



i ICAETCCCACG AfyfTCTTAT CCATTCTCAT TGAACGACGG 



ttgggtaccg agctcgaatt cctgcagccc 
' ACATTTCTTT TCAATTTCCA 



aactggggct j 
ttccacgccc 
tcttaggaac 

ACTTCATTTC 
BGACAGAATT 
CTCTGTAGAA 
TGCATTGCAG 
rCTCTCATGG 
&.CAGAGAAAA 
kCTCACAGAG 
CTAATATATA 
^GATAATCGA 
TATTGCTACC 
PCAACACAAC 
SAAAGCCAGT 
GATAAGCTCA 
rGGGAAGTTT 
SGAAAAGGTT 
&GGAAGAGGG . 
ftAATAAGTCG 
rCGAAAAGGA 
fcCCGAGAAAG 
TTGGTAAAAG 
TAGAACATGA 
3TGGAAGAAA 
3AATTGAACG 
3AGGGACAGG 
^AAAACCCAA 



TTTTTTGAGA CCTCGAAACA 
ACATACAAGA AAAAGGATAA 
ATTTATGAAT TTCATAGTAA 
TCGAACTTGC TATCCTCTTG 
AGAATGATTC ATTCGGATCG 
AATCCATGTT CCATATTTGA 
TACAATCCTC TTCCTGCTGA 
AATGGAGGAC TGG?GCOGAC 
CCGGGATCGC TAACTAATAG 
GAAATAGATA TctagctagA 
AATTGAAAAG AACTGXCTTT 
GCGGGTCTTA TGCAATCGAT 
ATAGGTCATC GAAAGGATCT 
TAGAAAATGG ATTCCTATTT 
CATTAACCCG TCAATTTTGG 
CGGGAAGAAA TTGGAATGGA 
CTCTATTGAT GCAAAC6CTG 
AAAAAICGAA ATGAAATAAAT 
AAGATAGAAG AGCCCAGATT 
TCCTTCTGAT TCTCAAAGAA 
ATTTCTTCTT ATTATAAGAC 
AACAATCTTC TCCTTIAATC 
AAACGTGACT CAATTGGTCT 
GGGCGAAGAC TCTCGAACGA 
AGGAGCCGTA TTAGGTGAAA 
AAGGGTGACT TATCTGTCGA 



CTCT6CCTTA Cf 



gatCjT TAdxa, 
TTCAAGAGTT 
TGAAATGGAC 
TGGTAGCCCT 
TAGAAATCCA 
CCSAATAGGC 
ATATGAGGAC 
AGAGGCTTGA 
GCCCCCTTTC 
AGTTCATCAC 
AATAGXACTA 
AATAGAAACA 
TCTGTATACT 
CGGATCATAT 
CGGACGACTC 
GAAGAGTGCC 
ATCCAATTCG 
ATAATATAGA 
TACCTAGAGG 
TAAAGAATAA 
CCAAATGAAG 
TGAGGGGCAA 
GTGATTTGAT 
ATAAATGGAA 
TAGTTAGTCT 
GGAAAAGGAT 
ATCTCATGTA 
CTTTTCCACT 



TTGCAGTTCG ' 
CAACAATGGT 
GCCGAAGTTT 
AAGCCTTACG 
GGCCGCCATC 
TCGAGATGGC 
TTCGGCGATC 
GCGCTTCgTA 
CCCCTCACGJ 
CGGGGGRi 



TCTTATCTGT 
AAATTCCTTC 
CCCATTAACT 
TGTCCTACCG 
AAAGATTGAC 
CCAACTACGT 
CCTCTGTGCT 
TCCTCGGTCC 
GGAAGAAAGA 
CTAACTAATA 
ACTAATATAT 
TTCCCCGTTC 
AGATATCCCT 
ACCAAAGCAC 
TAACCGCATG 
GGATTTTTCT 
TTCATACA&A 
ATAGGGATAG 
AGCAAAAAAA 
AAATGGAAAC 
GGGGATTGAT 
CCGCATATGT 
AGTGTTCAAT 
TCGGGACGGA 
CCCTTCGAAA 
CGATTCTGTA 
ATCAACCCCfl 
Figure 34A



GGGAACGGAT 
GCTTCATGCA 
TGGAGTTAGC 
AGCACGTGTG 
TCTCCTTCCT 
TAGTGGCAAC 
ACCTTACGGC 
CCGAGGGCAC 
TAAGGTTCTT 
CGGGCCCCCG 
GGCGGGATAC 
AGCACCTAGT 
CATTTGCTCC 
GTGCTTTCGC 
ACCGGAAATT
GCCTGTCCAG 
CTACAGACGC 
CTTACCGCGG 
CGTCATTGTT 
ACCTCCACGC 
CCCACTGCTG 
GCTGATCATC 
GCCTCACCAA 
CCTTTTGCTC 
TCCCCTCCCA

aaacaccact 
catcctgagc 
&tagcttcct 

aAGGATAACT 
2CAGCAGTAT 

axccaxrccc 

CTCACATTGG 
GTCAAGGTGA
GAAAGAGAAT
CCCTTCCTCC
GAAAAATTGG
ATGGATTCGA



TCACCGCCGT 
GGCGAGTTGC 
TCACCCTCGC 
TCGCCCAGGG 
CCGGCTTAAC 
TAAACACGAG 
ACGAGCTGAC 
CCCTCTCTTT 
CGCTTTGCAT 
TCAATTCCTT 
TTAACGCGTT 
ATCCATCGTT
CCTAGCTTTC 
CGTTGGTGTT 
CCCTCTGCCC 
GGTTGAGCCC 
TTTACGCCCA 
CTGCTGGCAC 
TCTTCTCCGA 
GGCATTGCTC 
CCTCCCGTAG 
CTCTCGGACC 
CTAGCTAATC 
CTCAGCCTAC 
AGGGCAG3TT 
TCCCGTTCGA 
CAGGATCGAA 
TATTCGTAGA
TGTATCCATG 
AGCCAACCCT 
GTTCGATCGT 
GTTTAGGGAT 
CACTCTACCG 
TACCGAATCC 
GGGCTTTCTT 
ATTCAATTGT 
GCCATAGCAC 



ATGGCTGACC 
AGCCTGCAAT 
GAGATCGCGA 
CATAAGGGGC 
ACCGGCGGTC 
GGTTGCGCTC 
GACAGCCATG 
CAAGAGGATT
CGAATTAAAC 
TGAGTTTCAT 
AGCTACAGCA 
TACGGCTAGG 
GTCTCTCAGT 
CTTTCCGATC 
CTACCGTACT 
TGGGATTTGA 
ATCATTCCGG 
AGAGTTAGCC 
GAAAAGAAGT
CGTCAGGCTT 
GAGTCTGGGC 
AGCTACTGAT 
AGACGCGAGC 
GGGGTATTAG 
CTTACGCGTT 
CTTGCATGTG 
CTCTCCATGA 
CAAAGCGGAT
CGCTTCAGAT 
ACCCTATCAC 
GGCGGGGGGA 
AATCAGGCTC 
CTGAGTTATA 
TAAGGCAAAG 
TCCACACTAT 
CAACCGGTCC 
ATGGTTTCAT 



GGCGATTACT 
CCGAACTGAG 
CCCTTTGTCC 
ATGATGACTT 
TGTTCAGGGT 
GTTGCGAGAC 
CACCACCTGT 
CGCGGCATGT 
CACATGCTCC 
TCTTGCGAAC
CTGCACGGGT 
ACTACTGGGG 
GTCAGTGTCG 
TCAATGCATT 
CCAGCTTGGT 
CGGCGGACTT 
ATAACGCTTG 
GATGCTTATT 
TGACGACCCG 
TCGCCCATTG 
CGTGTCTCAG 
CATCGCCTTG 
CCCTCCTTGG 
CAACCGTTTC 
ACTCACCCGT 
TTAAGCATGC 
GATTCATAGT 
TCGGAATTGT 
TATTAGCCTG 
GTCAATCCCA 
GTAAGTCAAA 
GAACTGATGA 
TCCCTTCCCC 
GGGCGAGAAA 
TATGGATAGT 
TATCGAAAAT 
aAAATCTGTA.. 



AGCGATTCCT 
GACGGGTTTT 
CGCCCATTGT 
GGCCTCATCC 
TCCAAACTCA
TTAACCCAAC
GTCCGCGTTC
CAAGCCCTGG
ACCGCTTGTG
GTACTCCCCA
CGAGTCGCAC
TCTCTAATCC
GCCCAGCAGA
TCACCGCTCC
AGTTTCCACC
GAAAAGCCAC
CATCCTCTGT
CCTCAGATAC 



3ATCTAART C GAGCAGGTflX C^TffAAS&A, gaccgacggC 

_ TCCWCTWT 



TGGGCCTTCC 
CGGAAAATTC
TCCCAGTGTG
GTAAGCTATT
GCGGATTTCT
CAGTTGTTGT
TCGCCACTGG
CGCCAGCGTT
TGCATTACTT
CTTTCCTTCC 
GAGTTCGCCA 
CAAGCCTCTT 
ATAGAAAAAA 
CTTCCACCAC 
GTCCCCTCGA 
CTCAAGGCCA 
CAAATAATGG 
AGGATTGACT 



&AACMAAA& 
fcAAAGAAAGG 



AAATgcAAGA 
gtctatgtaa 
AGCAATAGCA 



gcC TffATTTG" 
CAAACTC3AG 
M3GGCAGATT 
MTCGCCAATT 
2ATCTTCAAT 
FGTTTGTCTG 
2TTGTGGCCG 
2GATCCT6TT 
PTGTAGTTCC 
CTCAGGCATG 
WPCTTGAAAA 
SRaGG&ACAG 
rCCGTATGTT 



GTAAAATACT 
CCCTCTTGAT 



TATAGXTCAT 
AAGGACCATG 
GTGTGGACAG 
GGAGTATTTT 
GTTGTGTCTA 
CCGTGATGTA 
AGGATGTTTC 
GACGAGGGTG 
CGTCGTCCTT 
GCGCTCTTGA 
GCATTGAACA 
GTAGTTTTCC 
GCATCACCTT 



CCATGCCATG 
TGGTCTCTCT 
GTAAXGGTTG 
GTTGATAATG 
ATTTTGAAGT 
TACGTTGTGG 
CGTCCTCCTT 
TCTCCCTCAA 
GAAAGAGATG 
AGAAGTCGTG 
CCATAAGAGA 
AGTAGTGCAA 
CACCCTCTCC 



AGTAAATAftA 
AGAACAAGAA 
JAGGCCAGGA 



TGTAATCCCA 
TTTCGTTGGG 
TCTGGTAAPA 
GTCTGCTAGT 
TAGCTTTGAT 
GAGTTGTAGT 
GAAATCGATT 
AGTTGACTTC 
GTCCTCTCCT 
CCGCTTCATA 
AAGTAGTGAC 
ATAAATTTAA 
ACTGACAGAA 



atcgataagc 
TCTATAATGT1 ^ 
TAAAAAGAAA ^ 
A ATGATTATTf Vg- 
Tdpctctaga 
GCAGCTGTTA 
ATCTTTCGAA 
GGACAGGGCC 
TGAACGCTTC 
TCCATTCTTT 
TGTATTCCAA 
CCCTTAAGCT 
AGCACGTGTG 
GCACGTATCC 
TGATCTGGGT 
AAGTGTTGGC 
GGGTAAGTTT 
AATTTGTGCC 
[Sequence position index: 2751 to 5251 in steps of 50]



1CATTAACATC 



ACCATCTAAT 



attcttccaa 
agataagcct 
gcgctccatt 
ttactgcgct 
tcgccagccc
cgcctcaaat
ctggacctac
gccagatcaa 

ATTGGGCTGC 
ACGGAATGAT 
ATCTCGCTCT 

agctcgccgc 

CAATATCACT 
TGTACGGCCA 
CTCTGATAGT 



ctrcttcaga 



CTGATCTGCG 
GTCTAGCTTC 
GCCCAGTCGG 
GTACCAAATG 
AGTCGGGCGG 
AGATCCTGTT 
CAAGGCAACG 
TGTCGATCCT 
CATTCTCCAA 
GTCGTCGTGC 
CTCCAGGGGA 
GTTGTTXCAr 
GTSTSGCTTC 
GCAACGTCGG 
TGfrGJCCSATA 



£EBggcgacc 

tggtgai'ctc ' 

CGCGAGGCCA 
AAGTATGACG 
CAGCGACATC 
CGGGACAACG 
CGAGTTCCAT
CAGGAACCGG 
CTATGTTCTC 
GGCTGGCTCG 
ATTGCAGTTC 
ACAACAATGG 
AGCCGAAGTT 
CAAGCCTTAC 
AGGCCGCCAT 
TTCGAGATGG 



tccaattttc 

AGCGATCTTC 
GGCTGATACT 
CTTCGGCGCG 
TAAGCACTAC 
AGCGTTAAGG 
ATCAAAGAGT 
TTGCTTTTGT 
AAGATACCTG 
GCGCTTAGCT 
TGACTTCTAC 
TCCAAAAGGT 
GGTCACCGTA
CCACTGCGGA 
CGCTCGATGA
CAC< 



■G^aJ 
cttcaactgc 




2TACATTXCT TTTCAATTTC 
CCTTTTTTGA GACCTCGAAA 
R.CACATACAA GAAAAAGGAT 
rCATTTATGA ATTTCATAGT 
rTTCGAACTT GCTATCCTCT 
&AAGAATGAT TCATTCGGAT 
ilGAATCCATG . TTCCATATTT 
36TACAATCC TCTTCCTGCT 
ftAAATGGAGG ACTGGTGCCG 
&GCCGGGATC GCTAACTAAT 
rAGAAATAGA TATctagcta 
3AAATTGAAA AGAACTGTCT 
XGCGGGTCT. TATGCAATCG 
&CATAGGTCA TCGAAAGGAT 
3TTAGAAAAT GGATTCCTAT 
2ACATTAACC CGTCAATTTT 
TTCGGGAAGA AATTGGAATG 
rTCTCTATTG ATGCAAACGC 
3GAAAAATCG AAATGAAATA 
CGAAGATAGA AGAGCCGAGA 
GATCCTHCTG ATTCTCAAAG 
AGATTTCTTC TTATTATAAG 
AG&ACAATCT TCTCCTTTAA 
SAAAACGTGA CTCAATTGGT 
&AGGGCGAAG ACTCTCGAAC 
ICG TA2TAGGTGA 
CTTATCTGTC 



TTCTTGTCCA 
GGGCCGGCAG 

ATTXCGCTCfl 
TTTCATTTAG 
TCCTCCGCCG 
CAGCAAGATA ^ 
CAAGAATGTC 
GGATAACGCC « 
AGCGCGGAGA d 
CGTTGATCAa 
ACCAGCAAA3 
GCCGTACAA2 
CGCC 

etaggcaagt c 
AGTCATGCTtl - 
TTCTAGtGGG ' 
GCCCCCTCAC , 
GGCGj 



CGGGGGAQ 




CATTCAAGAG TTTCTTATCT 
CATGAAATGG ACAAATTCCT 
AATGGTAGCC CTCCCATTAA 
AATAGAAATC GATGTCCXAC 
TGCCTAATAG GCAAAGATTG 
CGATATGAGG ACCCAACTAC 
GAAGAGGGTT^GAt^PCTGTG 
GAGCCCCCTT "fCTCCTCGGT 
ACAGTTCATC ACGGAAGAAA 
AGAATAGTAC TACTAACTAA 
gAAATAGAAA CAACTAATAT 
TTTCTGTATA CTTTCCCCST 
ATCGGATCAT ATAGATATCC 
CTCGGACGAC TCACCAAAGC 
TTGAAGAGf G CCTAACCGCA 
GGATCCAATT CGGGATTTTT 
GAAXAATATA GATTCATACA 
TGTACCTAGA GGATAGGGAT 
AATAAAGAAT AAAGCAAAAA 
TTCCAAATGA AGAAAT6GAA 
AATGAGGGGC AAGGGGATTG 
ACGTGATTTG ATCCGCATAT 
TCATAAATGG ■ AAAGTGTTCA 
CTTAGTTAGT CTTCGGGACG 
GAGGAAAAGG ATCCCTTCGA 
AAATCTCATG TACGATTCTG 
GAC'j-xl'xCCA CTATCAACCC 



GTTTCCACGC 
TCTCTTAGGA 
CTACTTCATT 
CGAGACAGAA 
ACCTCTGTAG 
GTTGCATTGC 
CTTCTCtCAT 
CCACAGAGAA 
GAACTCACAG 
TACTAATATA 
ATAGATAATC 
TCTATTGCTA 
CTTCAACACA 
ACGAAAGCCA 
TGGATAAGCT 
CTTGGGAAGT: 
GAGGAAAAGG 
AGAGGAAGAG 
AAAAATAAGT 
ACTCGAAAAG 
ATAC CGAGAA 
GTTTGGTAAA 
ATTAGAACAT 
GAGTGGAAGA 
AAGAATTGAA 
TAGAGGGACA 
CAAAAAACCC 